Augmented exploration for big data and beyond

ABSTRACT

A computer is to obtain specification concept graphs of nodes spec₁, spec₂, . . . , spec_(m) including concept nodes and relation nodes according to at least one of a plurality of digitized data from a plurality of computerized data sources forming a first set of evidences U and obtain concept graphs of nodes cα₁, cα₂, . . . , cα_(n) including concept nodes and relation nodes for corresponding obtained plurality of information and knowledge (IKs) α₁, α₂, . . . , α_(n) forming a second set of evidences U. A subset of concept graphs of nodes is selected from cα₁, cα₂, . . . , cα_(n) according to a computable measure of consistency, inconsistency and/or priority threshold between cα_(j) in cα₁, cα₂, . . . , cα_(n) to specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m). Knowledge fragments are generated for corresponding subset of concept graphs cα_(i₁), cα_(i₂), . . . , cα_(i_h) to include augmenting information objects by creating or adding into at least one knowledge-base (KB), new objects in form ω=E→A from the concept fragments, including a computed validity (v) and a plausibility (p) for a combination of relationship constraints κ for the concept fragments and obtained propositions κ for the fragment concepts.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 15/584,874, filed May 2, 2017, which is based upon and claims priority benefit to prior U.S. Provisional Patent Application No. 62/331,642 filed on May 4, 2016 in the US Patent and Trademark Office, the entire contents of all of which are incorporated herein by reference.

FIELD

The embodiments discussed relate to computer implemented big data (large and complex data set), information and knowledge processing.

BACKGROUND

The ability to extract knowledge from large and complex collections of digital data and information, as well as the utilization of this information and knowledge, needs to be improved.

SUMMARY

According to an aspect of an embodiment, an apparatus including a memory and a processor coupled to the memory is to augment at least one of a plurality of digitized data input from a plurality of computerized data sources d₁, d₂, . . . , d_(l) forming a first set of evidences U to represent a first knowledge base (KB) among a plurality of KBs. A subset of concept graphs of nodes cα_(i₁), cα_(i₂), . . . , cα_(i_h) from cα₁, cα₂, . . . , cα_(n) is selected according to a computable measure of any one or combinations of consistency, inconsistency or priority threshold between cα_(j) in cα₁, cα₂, . . . , cα_(n) to specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m). Knowledge fragments are generated for corresponding obtained subset of concept graphs cα_(i₁), cα_(i₂), . . . , cα_(i_h) to include augmenting information objects by creating or adding into at least one KB among the KBs, new objects in form ω=E→A from the concept fragments, including a computed validity (v) and a plausibility (p) for a combination of relationship constraints _(κ) for the concept fragments and obtained propositions _(κ) for the fragment concepts.

These and other embodiments, together with other aspects and advantages which will be subsequently apparent, reside in the details of construction and operation as more fully hereinafter described and claimed, reference being had to the accompanying drawings forming a part hereof, wherein like numerals refer to like parts throughout.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a functional block diagram of computer system functions, according to an embodiment.

FIG. 2 is a diagram of concept node graphs, according to an embodiment.

FIG. 3 is a flow chart of a unification process for the knowledge fragments, according to an embodiment.

FIG. 4 is a flow chart of the unification process described in FIG. 3 with tags, according to an embodiment.

FIG. 5 is a flow chart of a unification process for the knowledge fragments without inconsistencies, according to an embodiment.

FIG. 6 is a flow chart of determining projection/forecasting based upon the knowledge fragments, according to an embodiment.

FIG. 7 is a flow chart of determining abduction for the knowledge fragments, according to an embodiment.

FIG. 8 is a functional block diagram of a computer, which is a machine, for implementing embodiments of the present invention.

DETAILED DESCRIPTION OF EMBODIMENT(S)

The embodiments described in the attached description entitled “Augmented Exploration for Big Data and Beyond” can be implemented as an apparatus (a machine) that includes processing hardware configured, for example, by way of software executed by the processing hardware and/or by hardware logic circuitry, to perform the described features, functions, operations, and/or benefits.

Big Data refers to large and complex data sets, for example, data from two or more dynamic (i.e., modified in real-time in response to events, or updating) heterogeneous data sources (i.e., different domain and/or similar domain but different sources) which are stored in and retrievable from computer readable storage media. According to an aspect of an embodiment, data can be text and/or image documents, data queried from a database, or a combination thereof. The embodiments provide predictive analytics, user behavior analytics, or certain other advanced data analytics to extract value from data. More particularly, Big Data refers to the problem whereby the processing of large, complex, dynamic, and noisy data overwhelms current state of the art data processing software, thus limiting software applications to achieving only limited and/or simplistic tasks. Advanced tasks involving more complex data analytics, such as semantics extraction or complex structure elicitation, are in the purview of Big Data and in the search for innovative solutions, algorithms, and software application development.

According to an aspect of an embodiment, the data sources are interconnected computing devices or objects in heterogeneous domains (e.g., computers, vehicles, buildings, persons, animals, plants), embedded with electronics, software, and sensors to transceive data over data networks including the Internet (Internet of Things (IoT)). According to another aspect of an embodiment, the data sources are Website content, including archived Website content.

Section 1. Augmented Exploration (A-Exploration)

The embodiments described provide a new and innovative approach, Augmented Exploration or A-Exploration, for working with Big Data, for example, to realize “The White House Big Data Initiative” and more by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration is a Comprehensive Computational Framework for Continuously Uncovering, Reasoning and Managing Big Data, Information and Knowledge. In addition to the built-in next generation real-time search engine, A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as oversee, regulate and/or supervise the development, progression and/or evolution of this information and knowledge.

A-Exploration supplements, extends, expands and integrates, among other things, the following related technologies:

1. Large-scale Real-time Information Retrieval and Extraction (LRIRE)
2. Bayesian Knowledge Base (BKB) and Compound Bayesian Knowledge Base (CBKB)
3. Augmented Knowledge Base (AKB) and Augmented Reasoning (AR)
4. Relational Database (RDb) and Deductive Database (DDb)
5. Users Modeling (UM)

The integrations and extensions of the above technologies, combined with new technologies described below:

1. Knowledge Completer (KC)
2. Knowledge Augmenter (KA)
3. Augmented Analyzer (AA)
4. Augmented Supervisor (AS)
5. Proprietary Hypothesis Plug-in (PHP),

enable and empower A-Exploration to create and establish diverse methods, functionalities, and capabilities to simplify and solve virtually all problems associated with Big Data, Information and Knowledge, including but not limited to:

1. Continuously Uncover and Track the Desired Information and Knowledge in real-time.
2. Enable, Produce and Generate Projection/Forecasting and Abduction.
3. Analysis and Deep Understanding of the Information/Knowledge.
4. Manage and Supervise the Progression and Evolution of the Information/Knowledge.
5. Detect, Verify and Anticipate Emergent Events and Behaviors.
6. Locate and Unearth the Missing Links between the Current Situations and the Eventual Desired Solutions.

Depending on the needs, other capabilities of A-Exploration may have to be supplemented, suspended, combined and/or adjusted.

The solutions and results mentioned above require not only the extraction of enormous amounts of up-to-the-minute information, but also instantaneous analysis of the information obtained. Thus, new paradigms and novel approaches are necessary. One new paradigm includes not only the next generation scalable dynamic computational framework for large-scale information and knowledge retrieval in real-time, but also the capability to analyze immediately and seamlessly the partial and piecemeal information and knowledge acquired. Moreover, there is a need for a deep understanding of the different information and knowledge retrieved in order to perform the other required operations. Depending on the applications, A-Exploration can provide the necessary solution or simply provide the necessary information, knowledge and analyses to assist the decision makers in discovering the appropriate solutions and directions.

An example of a Big Data Research and Development Initiative is concerned with:

1. improving our ability to harness and extract knowledge and insights from large and complex collections of digital data, and
2. through it, helping to solve some of the Nation's most pressing challenges, including accelerating the pace of scientific discovery, environmental and biomedical research, education, and national security.

A new and innovative process, Augmented Exploration or A-Exploration, is described for working with Big Data to realize the above initiative and more by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration is a Comprehensive Computational Framework for Continuously Uncovering, Reasoning and Managing Big Data, Information and Knowledge. In addition to the built-in next generation real-time search engine, A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as oversee, regulate and/or supervise the development, progression and/or evolution of this information and knowledge. This is one of the most challenging and all-encompassing aspects of Big Data, and is now possible due to the advances in information and communication infrastructures and information technology applications, such as mobile communications, cloud computing, automation of knowledge work, internet of things, etc.

A-Exploration allows organizations and policy makers to address and mitigate considerable challenges in capturing the full benefit and potential of Big Data and Beyond. It contains many sequential and parallel phases and/or sub-systems required to control and handle its various operations. Individual phases, such as extracting, uncovering and understanding, have been used in other areas and examined in detail using various approaches and tools, especially if the information is static and need not be acquired in real-time. However, overseeing and providing a possible roadmap for management, utilization and projection of the outcomes and results obtained are regularly and primarily performed by humans. Comprehensive formulations and tools to automate and integrate the latter are not currently available. Since an enormous amount of information is extracted and made available to the human in the loop, information overload is clearly a significant problem. Last, but not least, there exists no inclusive framework, at present, which considers all aspects of what is required, especially from a computational point of view, which makes automation not viable.

A-Exploration is vital and essential in practically any field that deals with information and/or knowledge, which covers virtually all areas in this Big Data era. It involves every area discussed in James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers, “Big data: The next frontier for innovation, competition, and productivity,” Technical report, McKinsey Global Institute, 2011, including health care, public sector administration, retail, manufacturing, and personal location data, and many other specific applications, such as dealing with unfolding stories, knowledge/information sharing and discovery, and personal assistance, to name just a few. As a matter of fact, A-Exploration provides a general platform, as well as diverse and compound methods and techniques, to fulfill the goal of the above White House Big Data Research and Development Initiatives and more.

To solve the problems related to the different applications of A-Exploration, an important initial phase requires the harnessing and extracting of relevant information from vast amounts of dynamic heterogeneous sources quickly and under the pressure of limited resources. Unfortunately, most standard information retrieval techniques, see discussion by Amit Singhal, “Modern information retrieval: A brief overview,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 24(4):35-43, 2001, do not have the necessary capabilities and/or may not be optimized for specific requirements needed.

The representation of the data fragments (also referred to as ‘data nuggets’) extracted is another potential problem. Most widely used representations, such as at least one or any combination of keywords, relationships among the data fragments (see FIG. 2, 204), expressions, vectors, etc., present hidden obstacles for the other phases involved. Even the most advanced existing representation, the concept graph, see discussions by John F. Sowa, “Conceptual Graphs For A Database Interface,” IBM Journal of Research and Development, 20(4):336-357, 1976, and Michel Chein and Marie-Laure Mugnier, “Graph-based Knowledge Representation: Computational Foundations of Conceptual Graphs,” Springer, 2009, suffers from the fact that it lacks reasoning capabilities to handle intelligent analyses and projections.

There are many available techniques for story understanding, see discussion in Erik T. Mueller, “Story Understanding Resources,” <retrieved from Internet: http://xenia.media.mit.edu/~mueller/storyund/storyres.html>. But virtually all were intended for static events and stories, and are not suitable for dynamic and real-time information and knowledge. Moreover, in A-Exploration, what may be needed is deep understanding of the various scenarios and relationships, as well as possible consequences of the information and knowledge, depending on the applications.

One of the most difficult problems to overcome is how to manage, utilize and supervise the knowledge and insights obtained from the other phases to resolve the possible outcomes of the situations and to project into the future. Moreover, with sufficient information and knowledge of the situations, one may be able to help shape, influence, manipulate and/or direct how they will evolve/progress in the future. Furthermore, it may help and facilitate the discovery/invention of new and improved methods for solving the problem at hand. Currently, existing frameworks might not provide the users with the desired solutions, or even assist the intended users in making the appropriate decisions or provide guidance to change the future statuses and/or courses in the desired/preferred directions.

Since typically everything might be performed under limited resources, allocating the resources, especially time and computing power, is another prime consideration. Besides the simple allocation strategies, such as random and round robin methods, other more sophisticated strategies can also be used in A-Exploration.

Section 2. Information and Knowledge

Information and knowledge (IK) come in many different flavors, and have many different properties. Moreover, IKs may be static or dynamic. Dynamic IKs may evolve, expand and/or progress, and may be subject to modification and/or alteration. They may also be combined or integrated with other IKs. The desirability or significance of specific IKs may depend on the applications. Except for proven or well-known IKs, in general, the importance of an IK may depend on the following characteristics:

1. Timeliness of the IKs.
2. Development, Expansion, Progression and/or Evolution of the IKs.
3. Volume and Diversity of the sources of the IKs.
4. Influence and Swiftness of the spreading to other IKs.
5. Intensity and Rapidity of the impact from other IKs.

FIG. 1 is a functional block diagram of computer system functions, according to an embodiment. Operations (0) through (9) are indicated in the diagram. The Augmented Supervisor 100 is a computer that processes inputs and commands (see factors (i) and/or (ii) below) from the user and/or another computing device. It manages, regulates, and supervises the development, progression, and evolution of information and knowledge, together with the creation of new information and knowledge. It also guides the User Model (102 u) and the Augmented Analyzer (112).

For LRIRE 102, the web, external databases, and other information and knowledge (IK) repositories serve as sources (102 a, 102 b, . . . , 102 n) of inputs. The inputs include structured data (e.g. forms, spreadsheets, website content, databases, etc.) and free-form data (e.g. unstructured text, live news feeds, etc.), or a combination (e.g. meta-data tags on html documents containing free text). Inputs are retrieved and organized based on any one or combination of two factors: (i) target query or domain specification (information retrieval), and (ii) specific user interests and preferences (User Modeling (102 u)) as specified by a user and/or by a computing device. Output is the organized information and knowledge (IK) according to the factors (i) and/or (ii).

At 102 x, Concept Node Graph Generator—Each piece of IK from each IK source (102 a, . . . 102 n) is translated into a concept graph representing the semantics of the IK.

At 102 y, Knowledge Fragmenter: The collection of concept graphs is used to construct knowledge fragments. Knowledge fragments identify value relationships between concepts found in the concept graphs.

At 104, Proprietary Hypothesis Plug-Ins: Information and knowledge fragments that are proprietary to the user or organization, and that drive hypothesis formation and shape the reasoning and analysis of A-Exploration, could be injected into the information/knowledge bases in 106.

At 106, Knowledge-Base Injector: The fragments are added into existing (or possibly empty) data/knowledge bases of varying forms (e.g. Relational DBs (106 b), BKBs (106 a), AKBs (106 c), etc.). Injection translates the fragments into the appropriate database format. Each information/knowledge base can be directly accessed from 108 and 110.

At 108, Knowledge Completer: All/some IK is extracted from the information/knowledge bases in 106. Unification processes (with Tags (108 a, FIG. 4) and sans Inconsistencies (108 b, FIG. 5)) validate and/or create new inductively generated knowledge that is then injected back into one or all information/knowledge bases, and update the User Model (102 u).

At 110, Knowledge Augmenter—Inspects information/knowledge bases and generates new knowledge through the Projector/Forecaster (110 a—FIG. 6) and the Abducer/Explainer (110 b—FIG. 7). Knowledge is injected back into one or all information/knowledge bases. Augmented knowledge is sent to the Augmented Analyzer (112) for processing.

At 112, Augmented Analyzer: Augmented analysis can produce new or existing IK, such as through Deep Comprehender (112 a), Explorer of Alternative Outcome (112 b), and Missing Link Hypothesizer (112 d), and can detect unexpected or emergent phenomena, such as through the Emergence Detector (112 c), as well as through other innovative technologies unique to Augmented Exploration. The output provides new augmented analysis, which is also injected into the information/knowledge bases and sent to the Augmented Supervisor (100) for evolving stories and continuous analyses.

FIG. 1 pertains to new integration of related technologies (102 and 106) and new technologies (100, 104, 108, 110, and 112). More particularly, for example, according to an aspect of an embodiment, a computer system including at least one computer is configured to generate, at operation 2, specification concept graphs of nodes spec₁, spec₂, . . . , spec_(m) including concept nodes and relation nodes according to at least one of a plurality of digitized data from user input (at operations 0, 1) from a plurality of computerized data sources d₁, d₂, . . . , d_(l) forming a first set of evidences U; generate, at operation 2, concept graphs of nodes cα₁, cα₂, . . . , cα_(n) including concept nodes and relation nodes for corresponding obtained plurality of IKs α₁, α₂, . . . , α_(n) (at operations 0, 1) forming a second set of evidences U;

At operation 2, select a subset of concept graphs of nodes cα_(i₁), cα_(i₂), . . . , cα_(i_h) from cα₁, cα₂, . . . , cα_(n) according to a computable measure of any one or combinations of consistency, inconsistency or priority (see equations (1)-(3)) threshold between each cα_(j) in cα₁, cα₂, . . . , cα_(n) to each specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m); generate knowledge fragments for corresponding obtained subset of concept graphs cα_(i₁), cα_(i₂), . . . , cα_(i_h). A knowledge fragment among the knowledge fragments stores a mapping of values to first and second sets of evidences U, where A is a rule among rules in the knowledge base, and E is a subset of the first and second sets of evidences U from the knowledge base that supports the rule A, so that the rule A is supportable by the subset of evidences E, according to the concept fragments.

At operation 3, the at least one computer is further configured to create or add into at least one among a plurality of knowledge-bases (KBs) for the corresponding knowledge fragments obtained by creating objects in form ω=E→A from the concept fragments; determining relationship constraints _(κ) in form of set relations among a plurality of subsets of evidences E for a plurality of the objects ω. At operation 4, authorized proprietary hypothesis secured plug-in may provide other knowledge fragments.
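As a non-limiting illustration, the objects ω=E→A and set relations among their evidence subsets can be sketched in Python as follows. The class name `KnowledgeObject`, the function `set_relations`, and the choice of "subset" and "disjoint" as the relations to report are assumptions made for this sketch only; the computation of validity (v) and plausibility (p) is not shown.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class KnowledgeObject:
    """An object of the form omega = E -> A (names are hypothetical)."""
    evidence: frozenset  # E, a subset of the evidences U
    rule: str            # A, a rule supported by E


def set_relations(objects):
    """List subset/disjoint relations among the evidence sets.

    Which set relations are recorded is an assumption of this sketch.
    """
    relations = []
    for a, b in combinations(objects, 2):
        if a.evidence <= b.evidence:
            relations.append((a.rule, "subset_of", b.rule))
        elif b.evidence <= a.evidence:
            relations.append((b.rule, "subset_of", a.rule))
        elif a.evidence.isdisjoint(b.evidence):
            relations.append((a.rule, "disjoint_from", b.rule))
    return relations


# Made-up evidences and rules, for illustration only.
objects = [
    KnowledgeObject(frozenset({"e1"}), "A1"),
    KnowledgeObject(frozenset({"e1", "e2"}), "A2"),
    KnowledgeObject(frozenset({"e3"}), "A3"),
]
constraints = set_relations(objects)
```

Here `constraints` holds one entry per pair of objects whose evidence sets are nested or disjoint; pairs that merely overlap produce no entry.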

At operation 5, any one or combination of knowledge completion functions of unification, including unification with tags and sans inconsistencies, is performed based upon the created objects ω so that a validity (v) and a plausibility (p) based upon atomic propositions among the rules A is computed for each object ω=E→A. Other knowledge augmenting functions based upon the created objects ω include, at operation 7, generating new knowledge through the Projector/Forecaster (110 a, FIG. 6) and the Abducer/Explainer (110 b, FIG. 7). Knowledge is injected back into one or all information/knowledge bases. At operation 8, augmented knowledge may be further provided to the Augmented Analyzer (112) for processing through Deep Comprehender (112 a), Explorer of Alternative Outcome (112 b), and Missing Link Hypothesizer (112 d), and for detecting unexpected or emergent phenomena such as through the Emergence Detector (112 c), the results of which, at operation 9, may be further output in form of reports or user interface for further processing and re-input at operation 0.

The plurality of data sources 100 (100 _(a . . . n)) store information and knowledge (IK) which are dynamic (i.e., modified in real-time in response to events, or updating) heterogeneous data (i.e., different domain and/or similar domain but different sources) in form of text and/or image. In case of ‘knowledge,’ the information includes meta data or semantic data indicative or representing a target knowledge. The concept graph generator 102 generates concept node graphs.

To facilitate continuous uncovering and tracking, the desired IKs are normally represented using concept graphs. In a non-limiting example, FIG. 2 is a diagram of an example concept node graph 200 for hydraulic fracturing domain. In FIG. 2, the knowledge fragment 204 includes two keywords and a relation among the two keywords. The embodiments are not limited to a domain and data from any domain or combinations of domains can be utilized. Other representations, if needed, will be superimposed subsequently to enable and/or enhance the performance of other operations of A-Exploration.

A concept graph (CG) is a directed acyclic graph that includes concept nodes and relation nodes. Consistency between two concept graphs, q and d, can be determined by a consistency measure:

cons(q,d) = n/(2*N) + m/(2*M),  (1)

where n, m are the number of concept and relation nodes of q matched in d, respectively, and N, M are the total number of concept and relation nodes in q. If a labeled concept node c occurs in both q and d, then c is called a match. Two labeled relation nodes are said to be matched if and only if at least one of their concept parents and one of their concept children are matched and they share the same relation label. In comparing two CGs that differ significantly in size (for example, a CG representing an entire document and another CG representing a user's query), the number of concept/entity nodes and relation nodes in the larger CG instead of the total number of nodes in both CGs, are used.

Equation 1 could be modified whenever necessary to provide more efficient measures to prioritize the resource allocations. For example, an inconsistency measure could be defined as follows:

incons(q,d) = n̄/(2*N) + m̄/(2*M),  (2)

where n̄, m̄ are the number of concept and relation nodes of q with no match in d, respectively, and N, M are the total number of concept and relation nodes in q, respectively. The priority measure is then given by:

prty(q,d) = cons(q,d) − incons(q,d).  (3)

By working strictly with labeled graphs, instead of general graph isomorphism, problems associated with computational complexity are avoided.
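As a non-limiting illustration, equations (1)-(3) and the label-based matching rules can be sketched in Python as follows. The classes `ConceptGraph` and `Relation`, the reading of the relation-matching rule as "shares at least one matched parent label and one matched child label", the handling of an empty node set as a zero term, and the example graphs are all assumptions of this sketch. The special case for two CGs of very different size is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    label: str
    parents: frozenset   # labels of parent concept nodes
    children: frozenset  # labels of child concept nodes


@dataclass(frozen=True)
class ConceptGraph:
    concepts: frozenset  # labels of concept nodes
    relations: tuple     # Relation nodes


def matched_counts(q, d):
    """Return (n, m): concept and relation nodes of q matched in d."""
    matched = q.concepts & d.concepts
    m = sum(
        1
        for rq in q.relations
        if any(
            rq.label == rd.label
            and rq.parents & rd.parents & matched
            and rq.children & rd.children & matched
            for rd in d.relations
        )
    )
    return len(matched), m


def _half_ratio(count, total):
    # An empty node set contributes zero (assumption of this sketch).
    return count / (2 * total) if total else 0.0


def cons(q, d):
    """Equation (1)."""
    n, m = matched_counts(q, d)
    return _half_ratio(n, len(q.concepts)) + _half_ratio(m, len(q.relations))


def incons(q, d):
    """Equation (2), counting the nodes of q with no match in d."""
    n, m = matched_counts(q, d)
    big_n, big_m = len(q.concepts), len(q.relations)
    return _half_ratio(big_n - n, big_n) + _half_ratio(big_m - m, big_m)


def prty(q, d):
    """Equation (3)."""
    return cons(q, d) - incons(q, d)


# Made-up graphs in the hydraulic fracturing domain, for illustration only.
hf = "hydraulic fracturing"
q = ConceptGraph(
    concepts=frozenset({hf, "groundwater", "methane"}),
    relations=(
        Relation("contaminates", frozenset({hf}), frozenset({"groundwater"})),
        Relation("releases", frozenset({hf}), frozenset({"methane"})),
    ),
)
d = ConceptGraph(
    concepts=frozenset({hf, "groundwater", "earthquake"}),
    relations=(
        Relation("contaminates", frozenset({hf}), frozenset({"groundwater"})),
        Relation("causes", frozenset({hf}), frozenset({"earthquake"})),
    ),
)
```

For these example graphs, two of the three concept nodes and one of the two relation nodes of q are matched in d, so cons(q,d) = 2/6 + 1/4 = 7/12, incons(q,d) = 1/6 + 1/4 = 5/12, and prty(q,d) = 1/6. Because matching compares labels only, each measure is computed with simple set operations.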

It is assumed that each IK has at least two time-stamps: an IK time-stamp, i.e., when the IK first appeared (initial or first version), and a representation time-stamp, i.e., when the IK was last represented or modified (updated, second (new or subsequent) versions) in the system, also referred to as data version events. The latter is needed in connection with allocation strategies. For the sake of quick referencing, in addition to the representation time-stamp, a copy of the IK time-stamp is also available in the representation.

If the IKs came from the same or affiliated sources, then all those IKs will be combined and all contradictions resolved in favor of the later or latest versions.
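As a non-limiting illustration, combining IKs from the same or affiliated sources can be sketched in Python as follows. Modeling each IK as a dictionary of assertions, and a contradiction as two different values for the same key, are assumptions of this sketch; the function name `combine_iks` is hypothetical.

```python
def combine_iks(versions):
    """Merge IKs from the same or affiliated sources.

    `versions` is a list of (ik_timestamp, assertions) pairs, where
    `assertions` is a dict. Later versions overwrite earlier ones, so
    contradictions are resolved in favor of the latest version.
    """
    combined = {}
    for _, assertions in sorted(versions, key=lambda pair: pair[0]):
        combined.update(assertions)
    latest = max(timestamp for timestamp, _ in versions)
    return latest, combined


# Made-up versions, deliberately given out of order.
latest, merged = combine_iks([
    (2.0, {"status": "confirmed"}),
    (1.0, {"status": "rumor", "site": "well A"}),
])
```

In this example the later value of "status" replaces the earlier one, while "site", which is not contradicted, is carried over unchanged.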

Let the combined IK α consist of IKs α₁, α₂, . . . , α_(n) from the same or affiliated sources with IK time-stamps t₁, t₂, . . . , t_(n), respectively, where t₁≤t₂≤ . . . ≤t_(n). Then we shall say that α has IK time-stamp t_(n). At time t, the diversity D(α, t) of α is the average of all e^(t−t_(i)), and the volume V(α, t) of α is the sum of all e^(t−t_(i)), where i=1, 2, . . . , n.

Any IK which is not a part of a combined IK is a stand-alone IK. For a stand-alone IK α, at time t, the diversity D(α, t) and the volume V(α, t) of α are both equal to e^(t−t₀), where t₀ is the IK time-stamp of α.

The e^(t−t_(i)) above represents the decaying factor of the IK, and depends on the unit of time used. Thus, the unit of time should be selected appropriately, and should be based on how fast-moving the IKs are.
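As a non-limiting illustration, the diversity and volume of a combined or stand-alone IK can be sketched in Python as follows. The function names are hypothetical, and the exponent is taken exactly as written in the text, t minus the IK time-stamp, in whatever time unit has been selected.

```python
import math


def decay_terms(timestamps, t):
    """Return e^(t - t_i) for each IK time-stamp t_i."""
    return [math.exp(t - t_i) for t_i in timestamps]


def diversity(timestamps, t):
    """D(alpha, t): the average of the terms."""
    terms = decay_terms(timestamps, t)
    return sum(terms) / len(terms)


def volume(timestamps, t):
    """V(alpha, t): the sum of the terms."""
    return sum(decay_terms(timestamps, t))
```

A stand-alone IK is the case of a single time-stamp, for which `diversity([t0], t)` and `volume([t0], t)` both reduce to e^(t−t₀).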

Let α₁ and α₂ be two IKs where α₁ and α₂ need not be distinct. The consistency measure m(α₁, α₂, t) between α₁ and α₂ at time t is given by:

m(α₁, α₂, t) = cons(r₁, r₂) × e^(t−min(t₁, t₂)),  (4)

where r₁ and r₂ are the latest representation of α₁ and α₂, respectively, and t₁ and t₂ are IK time-stamps of α₁ and α₂, respectively.

Let α be a (combined or stand-alone) IK.

1. For time t, the diversity D(α, t) of α is the sum of all m(α, α₀, t)×D(α, t) over all other combined or stand-alone IKs α₀.
2. For time t, the volume V(α, t) of α is the sum of all m(α, α₀, t)×V(α, t) over all other combined or stand-alone IKs α₀.

The diversity and volume of an IK also measure the timeliness and up-to-the-minute changes of the IK. Thus, diversity and volume can be used to measure the importance and/or the significance of the IKs as they progress or evolve.

One of the disadvantages of the diversity and volume discussed above is that they are biased against new IKs. Fortunately, this could be partially mitigated by the proper selection of the time unit so that new and recent IKs will automatically be assigned a proportionally larger diversity and volume.

For any IK α, there could be considerable extraneous noise associated with α. To eliminate that noise and to cut down on the required computations, instead of including all other IKs, the diversity and volume will be determined only by those other IKs whose consistency measures with α are above a certain threshold. We shall refer to these as the truncated diversity and volume.
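As a non-limiting illustration, equation (4) and the truncated diversity and volume can be sketched in Python as follows. The function names and the flat `(cons_value, ik_timestamp)` input format are assumptions of this sketch. The sketch also assumes that the D(α, t) and V(α, t) appearing inside the sums are the values obtained from the time-stamps of α alone, so that they scale the sum of the retained consistency measures.

```python
import math


def time_weighted_consistency(cons_value, t1, t2, t):
    """Equation (4): m = cons(r1, r2) * e^(t - min(t1, t2))."""
    return cons_value * math.exp(t - min(t1, t2))


def truncated_diversity_and_volume(base_d, base_v, t_alpha, others, t, threshold):
    """Scale the base D and V of alpha by the sum of retained measures.

    `others` is a list of (cons_value, ik_timestamp) pairs, one per
    other IK. Only measures strictly above `threshold` are retained.
    """
    total = 0.0
    for cons_value, t_other in others:
        m = time_weighted_consistency(cons_value, t_alpha, t_other, t)
        if m > threshold:
            total += m
    return total * base_d, total * base_v


# Made-up values: two other IKs, only the first clears the threshold.
d_trunc, v_trunc = truncated_diversity_and_volume(
    base_d=1.0, base_v=2.0, t_alpha=0.0,
    others=[(0.5, 0.0), (0.1, 1.0)], t=1.0, threshold=0.5,
)
```

Passing `threshold=float("-inf")` retains every other IK and so gives the untruncated sums.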

Since it is unlikely that one will be interested in all the IKs put forward by someone somewhere, it is advantageous to restrict the scope of the search. One such method is to use “watchwords” or “watch structures” to eliminate most of the unwanted IKs. (Watch structures are semantic structures consisting of watchwords, e.g., the query used in information retrieval. They can further sharpen the search for the desired IKs.) However, such screening could potentially overlook key IKs which may seem, or initially be, unrelated to the desired IKs. To alleviate the situation, one could widen the collection of watchwords or watch structures, as well as add watchwords or watch structures which may not be related but have a history of affecting and/or being affected by the IK in question. Historical semantic nets could be constructed in this regard based on historical connections among different ideas, concepts and processes.
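As a non-limiting illustration, screening by watchwords and widening the watchword set with historically connected terms can be sketched in Python as follows. Treating each IK as plain text, matching on single lower-case words, and representing the historical connections as a dictionary with lower-case keys are all assumptions of this sketch; watch structures and historical semantic nets are not modeled.

```python
import re


def tokenize(text):
    """Return the set of lower-case alphanumeric words in `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def widen(watchwords, historical_links):
    """Add terms that have a recorded history with any watchword."""
    widened = {word.lower() for word in watchwords}
    for word in list(widened):
        widened.update(term.lower() for term in historical_links.get(word, ()))
    return widened


def screen(ik_texts, watchwords, historical_links=None):
    """Keep the IKs that mention at least one (widened) watchword."""
    terms = widen(watchwords, historical_links or {})
    return [text for text in ik_texts if terms & tokenize(text)]


# Made-up IK texts and historical link, for illustration only.
texts = [
    "Fracking permit approved near aquifer",
    "Small earthquake reported in Oklahoma",
    "Local bakery wins award",
]
narrow = screen(texts, ["fracking"])
wide = screen(texts, ["fracking"], {"fracking": ["earthquake"]})
```

With the single watchword, only the first text is kept; after widening with the historically linked term, the second text is kept as well, while the unrelated third text is still screened out.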

According to the embodiments, metrics (Diversity and Volume) are provided to uncover and identify the desired IKs. However, to understand the significance and importance of these IKs, richer structural representations of the IKs are needed. Therefore, in the described framework, concept graphs are transformed into knowledge fragments, e.g., Bayesian Knowledge Fragments (BKF), Compound Bayesian Knowledge Fragments (CBKF), Augmented Knowledge Fragments (AKF), etc.

Section 3. Large-scale Real-time Information Retrieval and Extraction (LRIRE)

As mentioned above, one of the major challenges that need to be addressed is the harnessing and extracting of relevant knowledge and insights from vast amounts of dynamic heterogeneous sources quickly and under the pressure of limited resources. To meet this challenge, the capabilities offered by Large-scale Real-time Information Retrieval and Extraction (LRIRE) could be integrated with the other components of A-Exploration. However, the embodiments are not limited to LRIRE, and other related computational frameworks, for example, the related Anytime Anywhere Dynamic Retrieval (A²DR), can be utilized in connection with A-Exploration described herein. LRIRE is an example of one of the next generation scalable dynamic computational frameworks for large-scale information and knowledge retrieval in real-time, which is described in an article by Eugene Santos, Jr., Eunice E. Santos, Hien Nguyen, Long Pan, and John Korah, “A largescale distributed framework for information retrieval in large dynamic search spaces,” Applied Intelligence, 35:375-398, 2011.

The LRIRE incorporates various state-of-the-art successful technologies available for large-scale data and information retrieval. It is built by supplementing, extending, expanding and integrating the technologies initiated in I-FGM (Information Foraging, Gathering and Matching), see discussion by Eugene Santos, Jr., Eunice E. Santos, Hien Nguyen, Long Pan, and John Korah, “A Largescale Distributed Framework For Information Retrieval In Large Dynamic Search Spaces,” Applied Intelligence, 35:375-398, 2011.

LRIRE's goal is to focus on the rapid retrieval of relevant information/documents from a dynamic information space and to represent the retrieved items in the most appropriate forms for future processing. With LRIRE, the information/documents in the search spaces are selected and processed incrementally using an anytime-anywhere intelligence resource allocation strategy, and the results are provided to the users in real-time. Moreover, LRIRE is capable of exploiting user modeling (UM) to sharpen the inquiry in an attempt to dynamically capture the target behavior of the users.

In LRIRE, the partial processing paradigm is enhanced by using highly efficient anytime-anywhere algorithms, which produce relevancy approximations proportional to the computational resources used. Anytime-anywhere algorithms were designed for both text and image documents: “Anytime refers to providing results at any given time and refining the results through time. Anywhere refers to incorporating new information wherever it happens and propagating it through the whole network.”
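The anytime behavior described above can be illustrated with a minimal sketch. The scoring rule (fraction of query terms seen so far), the chunking of a document, and the function name `anytime_relevance` are assumptions made for illustration only and are not taken from LRIRE; the query is assumed to be non-empty.

```python
def anytime_relevance(query_terms, chunks):
    """Yield a refined relevance estimate after each processed chunk.

    Hypothetical scoring: the fraction of query terms seen so far.
    The caller may stop consuming at any time and keep the last value.
    """
    query = set(query_terms)
    seen = set()
    for chunk in chunks:
        # Partial processing: only this chunk is examined in this step.
        seen |= query & set(chunk.lower().split())
        yield len(seen) / len(query)


chunks = ["Floods hit the coast", "Relief funding was approved", "Coast roads reopened"]
estimates = list(anytime_relevance(["coast", "funding", "bridge"], chunks))
```

Each additional chunk costs more computation and can only keep or raise the estimate, which is the sense in which the approximation improves with the resources spent.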

LRIRE is an intelligent framework that incrementally and distributively gathers, processes, and matches information fragments in large heterogeneous dynamic search spaces, using anytime-anywhere technologies, and that converts from one representation to another when needed. The primary purpose of LRIRE is to assist the users in finding the pertinent information quickly and effectively, and to store it using the desired representations. Initially, the graphical structure of document graphs is ideal for anytime-anywhere processing, as the overhead for adding new nodes and relations is minimal.

LRIRE is capable of handling retrieval of multiple data types, including unstructured texts (typical format for most databases/sources), images, signals, etc.

In LRIRE, the information acquired is initially represented as document graphs (DG). A DG is essentially a concept graph (CG). However, other representations, such as knowledge fragments, will be superimposed subsequently to make possible the necessary operations and enhance the performance of other aspects of LRIRE.

Multiple queries can proceed in parallel in LRIRE, but in this case, it is advisable to identify the queries.

LRIRE uses various common representations of information for heterogeneous data types. Thus, LRIRE provides a seamless integration of text, images and signals through various unifying semantic representations of contents.

Section 4. Bayesian Knowledge Base (BKB) and Compound Bayesian Knowledge Base (CBKB)

As mentioned above, the information and knowledge available in LRIRE are initially represented using CGs. To equip the accumulated knowledge with the ability to reason, the CGs may be transformed into knowledge fragments, e.g., Bayesian Knowledge Fragments (BKFs), Compound Bayesian Knowledge Fragments (CBKFs) and Augmented Knowledge Fragments (AKFs), if needed. BKFs, CBKFs and AKFs are subsets of a Bayesian Knowledge Base (BKB), a Compound Bayesian Knowledge Base (CBKB) and an Augmented Knowledge Base (AKB), respectively.

A Bayesian Knowledge Base (BKB), see the discussion by Eugene Santos, Jr. and Eugene S. Santos, “A Framework For Building Knowledge-bases Under Uncertainty,” Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999, is a highly flexible, intuitive, and mathematically sound representation model of knowledge containing uncertainties.

To expand the usefulness of BKB in support of A-Exploration, a new object is developed and employed, which is referred to as a Compound Bayesian Knowledge Base (CBKB). In a CBKB, each S-node may be assigned multiple values. Each value represents a certain aspect of the rule and is therefore capable of more accurately reflecting the different types of uncertainties involved.

For any given CBKB, depending on the needs, one can consider either just one specific value in the S-node or a combination of a specific group of values in the S-node. In this case, the resulting structure is similar to a BKB. However, it may or may not be an actual BKB, since it may or may not satisfy the “exclusivity” constraint of BKB. As a matter of fact, many BKB-like objects could be derived from a single CBKB.

The fusion technology available in BKB is expanded for use in CBKB. This is essential in connection with the LRIRE cited above, which forms a major component of A-Exploration. This allows the representation of the information fragments obtained by LRIRE at different instances and with diverse queries to be combined to form a joint CBKB.

The properties and structures of CBKBs are further enriched by encoding and distinguishing the nodes in a CBKB according to the types of relation given in the concept or feature graphs. Besides the relation ‘is-a’, other relations exist between two concepts, such as ‘is-located’, ‘is-colored’, etc. For instance, using the relation ‘is-located’, we can view the node representing a location as a ‘location’ node.

Section 5. Augmented Knowledge Base (AKB) and Augmented Reasoning (AR)

Augmented Knowledge Base (AKB) and Augmented Reasoning (AR) are discussed in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference.

The following notations, definitions and results are described:

1. ℒ represents a collection of propositions L closed under the operations: ′ (not), ∧ (and), and ∨ (or).
2. ∅ and U are the empty set and the universal set, respectively; U could be a set of anything under consideration.
3. If G is a subset of U, then G′ is the complement of G, i.e., G′={x∈U|x∉G}.
4. T and F represent the logical constants TRUE and FALSE, respectively.
5. If S is a set, then |S| is the cardinality of S, and 2^(S) is the collection of all subsets of S.

An Augmented Knowledge Base (AKB) (over ℒ and U) is a collection of objects of the form (E, A), also denoted by E→A, where A∈ℒ and E is the body of evidence that supports the proposition or rule A. E is a subset of some universal set U, and ℒ is a collection of propositions or rules A, or a first order logic.
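As a minimal sketch of the definition above, an AKB can be modeled as a set of evidence/rule pairs. The class name `AKBObject`, the integer universe, and the rule strings are hypothetical choices made for illustration; the text does not prescribe any particular data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AKBObject:
    """One object E -> A: a body of evidence E (a subset of U) supporting rule A."""
    evidence: frozenset
    rule: str


# Hypothetical universal set U and a two-object AKB.
U = frozenset(range(10))
akb = {
    AKBObject(frozenset({0, 1, 2}), "rain -> wet"),
    AKBObject(frozenset({2, 3}), "rain"),
}

# Every body of evidence must be a subset of U.
well_formed = all(obj.evidence <= U for obj in akb)
```

Keeping E as a set, separate from the rule A, is what later allows evidence to be combined with union and intersection while rules are combined with logical connectives.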

Various reasoning schemes can be used with AKBs, including Augmented Reasoning (AR), which is a new reasoning scheme, based on the Constraint Stochastic Independence Method, for determining the validity and/or plausibility of the body of evidence that supports any simple or composite knowledge obtainable from the rules/knowledge that is contained in the AKB, i.e.,

.

As a matter of fact, AKBs encompass most existing knowledge bases, including their respective reasoning schemes; e.g., probabilistic logic, Dempster-Shafer, Bayesian Networks, Bayesian Knowledge Bases, etc. may be viewed or reformulated as special cases of AKBs. Moreover, AKBs and AR have pure probabilistic semantics and are therefore not subject to the anomalies found in most of the existing knowledge bases and reasoning schemes. AKBs and AR are not only capable of solving virtually all the problems mentioned above, they can also provide additional capabilities related to A-Exploration, e.g., inductive inferences with uncertainties and/or incompleteness, extraction of new knowledge, finding the missing link, augmented relational databases, augmented deductive databases, augmented inductive databases, etc.

The A in (E→A) is a rule in κ, which is similar to a proposition or rule in a traditional knowledge base. On the other hand, the E in (E→A) is a set, which represents a body of evidence, and is not a rule.

Let κ be an AKB.

1. ℒ_(κ)={A|(E→A)∈κ for some E}.
2. 𝒞_(κ) is the collection of all relations among the sets or bodies of evidence involved in κ. In other words, 𝒞_(κ) is the collection of all relations or constraints, including, for example, any one or combination of subset, disjoint (e.g., empty intersection), and partition, imposed on κ.
3. ε_(κ)={G⊆U|G occurs in some E where (E→A)∈κ, or G occurs in 𝒞_(κ)}.

Let κ be an AKB. ε̃_(κ) is the smallest collection of subsets of U that contains all (approximately all) the sets in ε_(κ) and is closed under the complement, union, and intersection operators.

Measures (probabilistic or not) are usually associated with collections of evidence to specify their strength. Moreover, they are usually extended to cover ε̃_(κ). In the case where the measure is probabilistic, it can be interpreted as the probability that L is true.

Let κ be an AKB. κ̂ is defined recursively as follows:

1. (∅→_(κ)F), (U→_(κ)T)∈κ̂.
2. If ω=(E→L)∈κ, then (E→_(κ)L)∈κ̂.
3. If ω₁=(E₁→_(κ)L₁), ω₂=(E₂→_(κ)L₂)∈κ̂, then ω₁∨ω₂=((E₁∪E₂)→_(κ)(L₁∨L₂))∈κ̂ and ω₁∧ω₂=((E₁∩E₂)→_(κ)(L₁∧L₂))∈κ̂.

κ̂ extends κ so that AKBs can deal with composite objects ω, associated with combinations of sets of evidences E in the knowledge base and combinations of rules in the knowledge base. Therefore, the embodiments utilize both composite evidences and composite rules to establish support for a target rule. Since κ is finite, so is κ̂. Members of κ̂ are referred to as composite objects.

∪ and ∩ denote ‘set union’ and ‘set intersection’, respectively, while ∨ and ∧ denote ‘logical or’ and ‘logical and’, respectively.
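The two composition rules can be sketched directly: evidence is combined with set union or intersection, while the rules are combined with the matching logical connective. Representing an object as a `(frozenset, str)` pair and building the composite rule as a string are simplifying assumptions for illustration.

```python
def disj(w1, w2):
    """ω1 ∨ ω2 = ((E1 ∪ E2) → (L1 ∨ L2))."""
    (e1, l1), (e2, l2) = w1, w2
    return (e1 | e2, f"({l1} ∨ {l2})")


def conj(w1, w2):
    """ω1 ∧ ω2 = ((E1 ∩ E2) → (L1 ∧ L2))."""
    (e1, l1), (e2, l2) = w1, w2
    return (e1 & e2, f"({l1} ∧ {l2})")


# Two hypothetical objects E1 -> A and E2 -> B.
w1 = (frozenset({1, 2}), "A")
w2 = (frozenset({2, 3}), "B")

either = disj(w1, w2)   # evidence {1, 2, 3} supports (A ∨ B)
both = conj(w1, w2)     # evidence {2} supports (A ∧ B)
```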

Let κ be an AKB.

1. Let ω=(E→_(κ)L)∈κ̂. Then l(ω)=E and r(ω)=L.
2. Let Ω⊆κ̂. Then l(Ω)={l(ω)|ω∈Ω} and r(Ω)={r(ω)|ω∈Ω}.
3. Let ω₁, ω₂∈κ̂. Then ω₁=ω₂ if and only if l(ω₁)=l(ω₂) and r(ω₁)≡r(ω₂).
4. Let ω₁, ω₂∈κ̂. Then ω₁≤ω₂ if and only if l(ω₁)⊆l(ω₂) and r(ω₁)⇒r(ω₂).

If ω is a composite object, l(ω) denotes the composite set of evidences associated with ω, and r(ω) denotes the composite rule associated with ω. This enables the rule portion and the associated set-of-evidences portion to be extracted from ω, where ω represents a composite object of a plurality of E→A.

Let κ be an AKB, G⊆U and L∈ℒ. $G \overset{d}{\rightarrow}_{\kappa} L$ if and only if there exists ω∈κ̂ such that l(ω)=G and r(ω)⇒L.

Let κ be an AKB and L∈ℒ. Then Σ_(κ)(L)={ω∈κ̂|r(ω)⇒L}, Σ_(κ)^(∨)(L)=∨_(ω∈Σ_(κ)(L))ω, η_(κ)(L)=r(Σ_(κ)^(∨)(L)), σ_(κ)(L)=l(Σ_(κ)^(∨)(L)), and σ̄_(κ)(L)=[σ_(κ)(L′)]′.

σ_(κ)(L) is the support of L wrt κ, while σ̄_(κ)(L) is the plausibility of L wrt κ.
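A small numeric sketch of the relation between support and plausibility, where the plausibility of L is the complement of the support of L′. The universe, the two support sets, and the uniform measure m(G)=|G|/|U| are all assumed values chosen for illustration; the text allows any measure, probabilistic or not.

```python
from fractions import Fraction

U = frozenset(range(1, 11))          # hypothetical universal set of evidence
support_L = frozenset({1, 2, 3})     # assumed value of the support of L
support_not_L = frozenset({8, 9})    # assumed value of the support of L'


def complement(g):
    """G' = U minus G."""
    return U - g


def measure(g):
    """Hypothetical uniform measure m(G) = |G| / |U|."""
    return Fraction(len(g), len(U))


# Plausibility of L is the complement of the support of L'.
plausibility_L = complement(support_not_L)

validity = measure(support_L)            # strength of the evidence for L
plausibility = measure(plausibility_L)   # strength of the evidence not against L
```

In this example the support of L is contained in its plausibility, so the two measured values bracket the strength of L from below and above.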

Various algorithms for determining σ_(κ)(F), some polynomial and some non-polynomial may be provided.

Let κ₁ and κ₂ be AKBs. κ₁ and κ₂ are equivalent if and only if for all L∈ℒ, σ_(κ₁)(L)=σ_(κ₂)(L).

Let κ be an AKB. κ is consistent if and only if G=∅, subject to the constraints 𝒞_(κ), whenever $G \overset{d}{\rightarrow}_{\kappa} F$.

In general, consistency imposes certain conditions that the E's must satisfy.

Let κ be an AKB, G⊆U and L∈ℒ. $G \overset{i}{\rightarrow}_{\kappa} L$ if and only if there exists L₀∈ℒ such that L⇒L₀ and G→_(κ)L₀.

Let κ be an AKB and L∈ℒ. Then Φ_(κ)(L)={ω∈κ̂|L⇒r(ω)}.

Let κ be an AKB and L∈ℒ. Then ϕ_(κ)(L)=∩_(ω∈Φ_(κ)(L))l(ω), subject to the constraints 𝒞_(κ), and ϕ̄_(κ)(L)=[ϕ_(κ)(L′)]′.

Let κ be an AKB.

1. Let ω∈κ̂. ω′ is defined recursively as follows:
   a) If ω=(E→A) where ω∈κ, then ω′=(E′→A′).
   b) If ω₁, ω₂∈κ̂, then (ω₁∨ω₂)′=ω₁′∧ω₂′ and (ω₁∧ω₂)′=ω₁′∨ω₂′.
2. Let Ω⊆κ̂. Then Ω̄={ω′|ω∈Ω}.
3. Let τ⊆κ. Then τ̄={ω′|ω∈τ}. When κ̄ is viewed as an AKB, 𝒞_(κ̄)=𝒞_(κ).

Let κ be an AKB and L∈ℒ. Then $\phi_{\kappa}(L) = \overline{\sigma}_{\overline{\kappa}}(L)$ and $\overline{\phi}_{\kappa}(L) = \sigma_{\overline{\kappa}}(L)$.

Let κ₁ and κ₂ be AKBs. κ₁ and κ₂ are i-equivalent if and only if for all L∈ℒ, ϕ_(κ₁)(L)=ϕ_(κ₂)(L).

Let κ be an AKB. κ is i-consistent if and only if G=U, subject to the constraints 𝒞_(κ), whenever $G \overset{i}{\rightarrow}_{\kappa} T$.

An AKB κ is disjunctive if and only if, for every (E→A)∈κ, A is a conjunction of disjunctions of atomic propositions.

It is assumed that

_(κ) is the smallest collection of atomic propositions that satisfies the above definition.

Let κ be an AKB. κ is irreducible if and only if κ satisfies the following conditions:

-   -   1. For every ω∈κ, l(ω)≠∅ and r(ω)≠T.     -   2. κ is disjunctive.     -   3. For every ω₁, ω₂∈κ, if r(ω₁)⇒r(ω₂), then l(ω₁)⊇l(ω₂).

Let κ be an AKB. If ω∈κ̂, then ρ(ω) is the collection of all atomic propositions that occur in r(ω). Moreover, let ω₁, ω₂∈κ̂. If both are atomic disjunctions over κ, and there exists an atomic proposition P such that P occurs in ω₁ and P′ occurs in ω₂, then ω₁⋄ω₂ merges ω₁ and ω₂, with a single occurrence of each of P and P′ removed.

Let

⊆

_(a), the collection of all atomic propositions in

.

-   -   1.         ={P′|P∈P} and         =         ∪         .     -   2.         ^(∨) is the collection of all         P where         ⊆         . For completeness,         P=F if         =∅.     -   3.         ^(∧) is the collection of all         P where         ⊆         . For completeness,         P=T if         =∅.     -   4.         is the smallest collection of         , containing         and closed under the operations: ′ (not), ∧ (and), and ∨ (or).

It can be shown that

⊆

⊆

^(∨)⊆

and

⊆

⊆

^(∧)⊆

.

Let

⊆

_(a).

is simple if and only if for every P∈

_(a), not both P and P′ are in

.

Let L∈

. L is ∨-simple if and only if for some simple

⊆

_(a), L=

P.

Let L∈

. L is ∧-simple if and only if for some simple

⊆

_(a), L=

P.

In what follows, without loss of generality, we shall assume that L is ∨-simple whenever L∈ℒ_(a)^(∨), and L is ∧-simple whenever L∈ℒ_(a)^(∧).

Disjunctive AKB is introduced in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference, and an algorithm to transform any AKB into a disjunctive AKB was given there. If κ is an AKB, then κ^(D) denotes the corresponding disjunctive AKB.

Let κ be an AKB

-   -   1. κ is disjunctive if and only if A∈         _(a) ^(∨) for every (E→A)∈κ.     -   2. κ is conjunctive if and only if A∈         _(a) ^(∧) for every (E→A)∈κ.     -   3.         _(k) is the collection of all atomic propositions that appear in         κ.

Algorithm (Disjunctive AKB). Given an AKB κ, return κ₀.

1. Let κ₀=∅.
2. For each ω∈κ, do Steps 3 and 4.
3. Express r(ω) in CNF over κ, i.e., r(ω)=L₁∧L₂∧ . . . ∧L_(k), where L_(i) is ∨-simple for each i=1, 2, . . . , k.
4. For each i=1, 2, . . . , k, add to κ₀ the object l(ω)→L_(i).
5. Return κ₀.
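The steps above can be sketched as follows, under the assumption that each rule has already been put in CNF (Step 3 is not implemented here). The representation is hypothetical: a clause is a set of literals written `"A"` or `"A'"`, and each AKB entry is a pair of an evidence set and a list of clauses.

```python
def to_disjunctive(akb):
    """Split every object E -> (L1 ∧ ... ∧ Lk) into the k objects E -> Li.

    akb: iterable of (evidence, cnf) pairs, where cnf is a list of clauses
    and each clause is a set of literals such as "A" or "A'".
    """
    k0 = set()
    for evidence, cnf in akb:
        for clause in cnf:
            # Each clause inherits the full body of evidence of its source.
            k0.add((frozenset(evidence), frozenset(clause)))
    return k0


akb = [
    ({1, 2}, [{"A'", "B"}, {"C"}]),   # E1 -> (A' ∨ B) ∧ C
    ({2, 3}, [{"A"}]),                # E2 -> A
]
k0 = to_disjunctive(akb)
```

Storing the result as a set means that two source objects producing the same evidence and clause contribute only one entry.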

Let κ be a disjunctive AKB.

-   -   1. Ω _(κ)={E→A|E⊆U, A∈         _(κ)}.     -   2. Ω̆_(κ)={E→A|E⊆U, A∈         _(κ) ^(∨)}.     -   3. {tilde over (Ω)}_(κ)={E→A|E⊆U, A∈         _(κ)}.

Let ∅=Ω⊆{tilde over (Ω)}_(κ). Then ∧_(w∈Ω)ω=(U→T) and ∨_(w∈Ω)ω=(∅→F). Let κ be a irreducible AKB, ω₁, ω₂∈Ω̆_(κ) and P∈

_(κ). P is complemented with respect to ω₁ and ω₂ if and only if either P∈ρ(ω₁) and P′∈ρ(ω₂), or P′∈ρ(ω₁) and P∈ρ(ω₂).

Let κ be a disjunctive AKB, ω₁, ω₂∈Ω̆_(κ) and P∈

_(κ). P is complemented with respect to ω₁ and ω₂ if and only if either P∈ρ(ω₁) and P′∈ρ(ω₂), or P′∈ρ(ω₁) and P∈ρ(ω₂).

Let κ be a disjunctive AKB and ω₁, ω₂∈Ω̆_(κ). In what follows, P denotes an atomic proposition that appears in κ.

1. c(ω₁, ω₂) is the collection of all P that are complemented with respect to ω₁ and ω₂.
2. ω₁≈ω₂ if and only if |c(ω₁, ω₂)|=0, i.e., c(ω₁, ω₂)=∅.
3. ω₁˜^(P) ω₂ if and only if P∈ρ(ω₁), P′∈ρ(ω₂) and ω₁/P≈ω₂/P′.
4. ω₁≃^(P) ω₂ if and only if P∈ρ(ω₁), P′∈ρ(ω₂) and ω₁/P=ω₂/P′.
5. ω₁˜ω₂ if and only if ω₁˜^(P) ω₂ for some P. In other words, ω₁˜ω₂ if and only if |c(ω₁, ω₂)|=1.
6. ω₁≃ω₂ if and only if ω₁≃^(P) ω₂ for some P.
7. If ω₁˜^(P) ω₂, then ω₁⋄ω₂=(l(ω₁)∩l(ω₂)→r(ω₁/P)∨r(ω₂/P′)). Otherwise, ω₁⋄ω₂ is not defined. In other words, ω₁⋄ω₂ merges ω₁ and ω₂ and removes both P and P′.

The ⋄ operator defined above is a powerful unification operator which is capable of handling uncertain knowledge in a natural manner. Therefore, it is a true generalization of the traditional unification operators.
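A minimal sketch of the ⋄ operator for objects whose rules are single disjunctions. The representation is an assumption: an object is a pair of an evidence `frozenset` and a `frozenset` of literals, with a trailing apostrophe marking a complemented atom. The function returns `None` when the operator is not defined, i.e., when the two objects do not have exactly one complemented atom between them.

```python
def neg(lit):
    """Complement of a literal: "A" <-> "A'"."""
    return lit[:-1] if lit.endswith("'") else lit + "'"


def diamond(w1, w2):
    """ω1 ⋄ ω2: intersect the evidence and merge the disjunctions,
    dropping the single complemented pair P, P'. Undefined -> None."""
    (e1, c1), (e2, c2) = w1, w2
    clashes = [p for p in c1 if neg(p) in c2]
    if len(clashes) != 1:
        return None
    p = clashes[0]
    return (e1 & e2, (c1 - {p}) | (c2 - {neg(p)}))


w1 = (frozenset({1, 2, 3}), frozenset({"A'", "B"}))   # E1 -> A' ∨ B
w2 = (frozenset({2, 3, 4}), frozenset({"A"}))         # E2 -> A
merged = diamond(w1, w2)                              # (E1 ∩ E2) -> B

# Two complemented atoms: the operator is not defined.
w3 = (frozenset({1}), frozenset({"A", "B"}))
w4 = (frozenset({1}), frozenset({"A'", "B'"}))
undefined = diamond(w3, w4)
```

The evidence of the result is the intersection of the two bodies of evidence, which is how uncertainty is carried through each unification step.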

Section 6. New Unification Algorithms and New Applications of Uncertainties

In what follows, it is assumed that κ is an AKB. AKB is further described in U.S. Pat. No. 9,275,333.

However, most of the time, instead of dealing with κ itself, we shall be dealing with κ^(D). κ^(D) is the disjunctive knowledge base associated with κ. Nevertheless, it is essential that κ be kept intact, especially if inductive reasoning is part of the solutions.

The unification algorithm below is different from related unification algorithms, including the unification algorithms discussed in U.S. Pat. No. 9,275,333, because according to an aspect of an embodiment, it is determined whether any atomic proposition A is selectable, and if there are any atomic propositions A, construct the two sets Q(A) and Q(A′) and then combine the elements in Q(A) with the elements in Q(A′) using the ⋄ operations. The results are added into the knowledge base while discarding the original elements in Q(A) and Q(A′).

Algorithm 6.1. (Unification). According to an aspect of an embodiment, unification refers to determining of the validity (in the case of deductive reasoning) of a target rule L. FIG. 3 is a flow chart of a unification process for the knowledge fragments, according to an embodiment. In FIG. 3, at 302, initialize a new knowledge base κ^(L) based on the original knowledge base κ and the target rule L. At 304, construct the sets from κ^(L) which contains propositions involving A and complements of propositions A′. At 306, check whether or not there are still any atomic proposition left. At 308, arrange the atomic propositions in a list according to how often they occur. At 310, check whether or not the list is empty. At 312 process the first element of the list. At 314, output δ_(κ)(L) whose union represents the validity of L.

More specifically, given an AKB κ and L∈ℒ, output δ_(κ)(L), the collection of all l(ω) where ω∈κ̂ and r(ω)=F, as follows:

1. Let κ^(L)=κ^(D), where κ^(L) is a new AKB and κ^(D) is the disjunctive AKB associated with κ.
2. Rewrite L′ in CNF over κ. In other words, L′=L₁∧L₂∧ . . . ∧L_(n), where for every i=1, 2, . . . , n, L_(i) is a disjunction of atomic propositions in ℒ. According to an aspect of an embodiment, an atomic proposition refers to a type of target rule that conveys the concept of a proposition that is not made up of other propositions; for example, A is an atomic proposition but A∨B is not. Some propositions or rules are atomic propositions or rules.
3. Add to κ^(L) all items of the form (U→L_(i)), where i=1, 2, . . . , n.
4. Initialize the output δ_(κ)(L)=∅.
5. Let 𝒜 consist of all distinct terms or atomic propositions that appear in κ^(L). The A's in 𝒜 are atomic propositions or atomic rules.
6. For each A∈𝒜, associate the two lists of elements in κ^(L), Q(A)={ω∈κ^(L)|A∈ρ(ω)} and Q(A′)={ω∈κ^(L)|A′∈ρ(ω)}.
7. Compute q(A)=Σ_(ω∈(Q(A)∪Q(A′)))|ρ(ω)| for all A∈𝒜.
8. Remove from κ^(L) all ω∈Q(A)∪Q(A′) where q(A)=0.
9. Remove from 𝒜 all A with q(A)=0.
10. If 𝒜=∅, go to Step 16.
11. Sort the A∈𝒜 in increasing order of q(A) to obtain the ordered list A₁, A₂, . . . , A_(n).
12. If the ordered list given above is empty, go to Step 16. Otherwise, select the first element from the list, say A.
13. Remove A from 𝒜.
14. For each ω₁∈Q(A) and ω₂∈Q(A′):
   a) Construct ω₀=ω₁⋄ω₂.
   b) Add l(ω₀) into δ_(κ)(L) if r(ω₀)=F. Otherwise, add ω₀ into κ^(L).
15. Go back to Step 6 above and repeat the algorithm starting from that step.
16. Output δ_(κ)(L), which is the set of composite objects used to derive a validity value σ_(κ)(L) for L (discussed below) based upon a union of the composite objects.

The above process may be modified for use as an Anytime-Anywhere algorithm. It does not require that κ^(L) be determined completely in advance. It may be stopped anytime by executing Step 16, and continued at the place where it was stopped when more time is allocated. In addition, the algorithm can accept any addition to κ^(L) while continuing the unification process. Finally, other heuristics may be used instead of or in addition to Step 6 given in the algorithm to improve the results, Anytime-Anywhere or otherwise.

Given an AKB κ and L∈ℒ, σ_(κ)(L)=∪_(G∈δ_(κ)(L))G, which is a value indicative of validity.

In Algorithm 6.1 above, let L=F. Then, we obtained the value of σ_(κ)(F). Thus, the algorithm can determine whether κ is consistent or not. As a matter of fact, if E=∅ for every E∈δ_(κ)(F), then κ is consistent. Else, κ is not consistent.
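The unification procedure and the consistency test described above can be sketched as follows. This is a simplified illustration, not a definitive implementation: it assumes the AKB is already disjunctive, that L′ is supplied as a list of clauses, that no clause contains both an atom and its complement, and it ignores any constraints among the evidence sets. The names `unify` and `support` and the literal encoding (`"A"`, `"A'"`) are hypothetical.

```python
def neg(lit):
    """Complement of a literal: "A" <-> "A'"."""
    return lit[:-1] if lit.endswith("'") else lit + "'"


def diamond(w1, w2):
    """Merge two clause objects on their single complemented atom, else None."""
    (e1, c1), (e2, c2) = w1, w2
    clashes = [p for p in c1 if neg(p) in c2]
    if len(clashes) != 1:
        return None
    p = clashes[0]
    return (e1 & e2, (c1 - {p}) | (c2 - {neg(p)}))


def unify(kd, negated_target_cnf, universe):
    """Return delta: evidence sets of derived objects whose rule is F."""
    work = set(kd) | {(universe, frozenset(c)) for c in negated_target_cnf}
    delta = [e for e, c in work if not c]
    work = {w for w in work if w[1]}
    while work:
        atoms = {lit.rstrip("'") for _, c in work for lit in c}

        def q(a):
            # q(A): total size of the clauses mentioning A or A'
            return sum(len(c) for _, c in work if a in c or neg(a) in c)

        a = min(atoms, key=lambda x: (q(x), x))      # smallest q(A) first
        pos = [w for w in work if a in w[1]]         # Q(A)
        negs = [w for w in work if neg(a) in w[1]]   # Q(A')
        work -= set(pos) | set(negs)                 # discard the originals
        for w1 in pos:
            for w2 in negs:
                w0 = diamond(w1, w2)
                if w0 is None:
                    continue
                if w0[1]:
                    work.add(w0)
                else:
                    delta.append(w0[0])              # r(w0) = F
    return delta


def support(delta):
    """Union of the evidence sets in delta."""
    return frozenset().union(*delta)


U = frozenset(range(1, 6))
kd = {
    (frozenset({1, 2, 3}), frozenset({"A'", "B"})),  # E1 -> A' ∨ B
    (frozenset({2, 3, 4}), frozenset({"A"})),        # E2 -> A
}
support_B = support(unify(kd, [{"B'"}], U))          # target L = B, so L' = B'

# Consistency test: take L = F, so L' = T contributes no clauses.
consistent = not any(unify(kd, [], U))
kd_bad = kd | {(frozenset({3, 5}), frozenset({"A'"}))}
conflict = support(unify(kd_bad, [], U))
```

In the first run the only derivation of F uses both objects and the negated target, so the support of B is the intersection of their evidence. In the last run the added object contradicts E2→A on the evidence element 3, which is why the knowledge base is reported as inconsistent there.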

Since σ_(κ)(F)⊆σ_(κ)(L) for every L∈ℒ, the inconsistencies in κ will affect the unification process, including Algorithm 7.1. In the next section, we shall provide a new algorithm to perform unification while ignoring inconsistencies.

Due to the dual nature of deductive reasoning and inductive reasoning, the above unification method is applicable to both. In the latter case, the AKB κ must be kept intact for constructing κ̄, because the disjunctive AKB of κ̄ cannot be constructed from κ^(D); in general, (κ̄)^(D) differs from the complement of κ^(D). The unification method can be used with either validity or plausibility measures.

Algorithm 6.1 is modified to introduce tags η to identify the sources of the results. In other words, the tags η(ω) indicate how the results were obtained for ω within the AKB. For example, if ω₁=(E₁→A′∨B) and ω₂=(E₂→A) are in the knowledge base, then the tag for ω=(E₁∩E₂→B) corresponds to ω₁ and ω₂. According to an embodiment, information indicating a tag is associated with ω.

Algorithm 6.2. (Unification with Tag).

FIG. 4 is a flow chart of the unification process described in FIG. 3 with tags, according to an embodiment. Operations 402, 404, 406, 408, 410, and 414 are the same as the corresponding operations of Algorithm 6.1. At 412, the tags are updated accordingly. δt_(κ)(L) is the same as δ_(κ)(L) in Algorithm 6.1 but with an associated tag.

Given AKBs κ₁, κ₂, . . . , κ_(n) and L∈ℒ, output δt_(κ)(L):

1. Transform each of κ₁, κ₂, . . . , κ_(n) into a disjunctive AKB, and let κ=∪_(i=1)^(n)κ_(i) and κ^(L)=κ.
2. For each ω∈κ, let η(ω)=∅.
3. For i=1 to n and each ω∈κ, add i to η(ω) if ω∈κ_(i).
4. Rewrite L′ in CNF over κ. In other words, L′=L₁∧L₂∧ . . . ∧L_(m), where for every i=1, 2, . . . , m, L_(i) is a disjunction of atomic propositions in ℒ.
5. Add to κ^(L) all items of the form ω_(j)=(U→L_(j)) with tag η(ω_(j))=∅, where j=1, 2, . . . , m.
6. Let δt_(κ)(L)=∅.
7. Let 𝒜 be the smallest subset of ℒ_(a) over which all r(ω), where ω∈κ, can be expressed.
8. That is, let 𝒜 consist of all distinct terms that appear in κ.
9. For each A∈𝒜, associate the two lists, Q(A)={ω∈κ^(L)|A∈ρ(ω)} and Q(A′)={ω∈κ^(L)|A′∈ρ(ω)}.
10. Compute q(A)=Σ_(ω∈(Q(A)∪Q(A′)))|ρ(ω)| for all A∈𝒜.
11. Remove from κ^(L) all ω∈Q(A)∪Q(A′) where q(A)=0.
12. Remove from 𝒜 all A with q(A)=0.
13. If 𝒜=∅, go to Step 19.
14. Sort the A∈𝒜 in increasing order of q(A) to obtain the ordered list A₁, A₂, . . . , A_(q).
15. If the ordered list given above is empty, go to Step 19. Otherwise, select the first element from the list, say A.
16. Remove A from 𝒜.
17. For each ω₁∈Q(A) and ω₂∈Q(A′):
   a) Construct ω₀=ω₁⋄ω₂.
   b) If ω₀ is defined, do the following:
      i. Let η(ω₀)=η(ω₁)∪η(ω₂).
      ii. Add ω₀ together with η(ω₀) into δt_(κ)(L) if r(ω₀)=F. Otherwise, add ω₀ together with η(ω₀) into κ^(L).
18. Go back to Step 9 above and repeat the algorithm starting from that step.
19. Output δt_(κ)(L).

Observe that Algorithm 6.2 is similar to Algorithm 6.1, except that tags are created only in the former.
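The tag bookkeeping can be sketched on a single unification step. As an illustrative assumption, an object is a triple of evidence, clause, and tag, where the tag is the set of indices of the source AKBs that contributed to it; the helper name `diamond_tagged` is hypothetical.

```python
def neg(lit):
    """Complement of a literal: "A" <-> "A'"."""
    return lit[:-1] if lit.endswith("'") else lit + "'"


def diamond_tagged(w1, w2):
    """ω1 ⋄ ω2 with tags: the tag of the result is the union of both tags."""
    (e1, c1, t1), (e2, c2, t2) = w1, w2
    clashes = [p for p in c1 if neg(p) in c2]
    if len(clashes) != 1:
        return None                      # the operator is not defined
    p = clashes[0]
    return (e1 & e2, (c1 - {p}) | (c2 - {neg(p)}), t1 | t2)


# ω1 = (E1 -> A' ∨ B) comes from source AKB 1, ω2 = (E2 -> A) from source AKB 2.
w1 = (frozenset({1, 2, 3}), frozenset({"A'", "B"}), frozenset({1}))
w2 = (frozenset({2, 3, 4}), frozenset({"A"}), frozenset({2}))
w0 = diamond_tagged(w1, w2)              # (E1 ∩ E2 -> B) tagged with {1, 2}
```

Because tags only grow by union, the tag of any derived object lists every source that took part in deriving it.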

Algorithm 6.3. (Unification sans Inconsistencies).

FIG. 5 is a flow chart of a unification process for the knowledge fragments without inconsistencies, according to an embodiment. At 502, initialize κ^(L) based on the original knowledge base κ, κ^(M) based on the target rule L. 504, similar to 304 but the two sets are constructed from κ^(L) and κ^(M), respectively. 506, 508, and 510 are same as 306, 308 and 310. 512 is similar to 312 but the result is put back in κ^(M). At 514, output the result δi_(κ)(L) whose union represents the validity of L, with all uncertainties removed.

Given an AKB κ and L∈ℒ, output δi_(κ)(L):

1. Transform κ into a disjunctive AKB κ^(D), and let κ^(L)=κ^(D).
2. Rewrite L′ in CNF over κ. In other words, L′=L₁∧L₂∧ . . . ∧L_(n), where for every i=1, 2, . . . , n, L_(i) is a disjunction of atomic propositions in ℒ.
3. Let κ^(M) consist of all items of the form (U→L_(i)), where i=1, 2, . . . , n.
4. Let δi_(κ)(L)=∅.
5. Let 𝒜₀ be the smallest subset of ℒ_(a) over which all r(ω), where ω∈κ^(L)∪κ^(M), can be expressed; and let 𝒜=𝒜₀.
6. For each A∈𝒜, associate the two lists, Q(A)={ω∈κ^(L)|A∈ρ(ω)} and Q(A′)={ω∈κ^(M)|A′∈ρ(ω)}.
7. Compute q(A)=Σ_(ω∈(Q(A)∪Q(A′)))|ρ(ω)| for all A∈𝒜.
8. Remove from κ^(L) and from κ^(M) all ω∈Q(A)∪Q(A′) where q(A)=0.
9. Remove from 𝒜 all A with q(A)=0.
10. If 𝒜=∅, go to Step 16.
11. Sort the A∈𝒜 in increasing order of q(A) to obtain the ordered list A₁, A₂, . . . , A_(n).
12. If the ordered list given above is empty, go to Step 16. Otherwise, select the first element from the list, say A.
13. Remove A from 𝒜.
14. For each ω₁∈Q(A) and ω₂∈Q(A′):
   a) Construct ω₀=ω₁⋄ω₂.
   b) Add ω₀ into δi_(κ)(L) if r(ω₀)=F. Otherwise, add ω₀ into κ^(M).
15. Go back to Step 6 above and repeat the algorithm starting from that step.
16. Output δi_(κ)(L).

Let κ be a consistent AKB and L∈ℒ. Then σ_(κ)(L)=∪_(ω∈δi_(κ)(L))l(ω), where δi_(κ)(L) is the output of Algorithm 6.3. Observe that in Algorithm 6.3, although inconsistencies may exist in κ and may not be removed, they were approximately ignored during the unification process. This follows from the fact that the unification process builds two separate knowledge bases, κ^(L) and κ^(M). Since κ^(M) starts with the target and is always involved in the unification, unifications produced by inconsistencies in the original knowledge base cannot happen. However, since the inconsistencies may permeate the entire system, they may create many interesting properties and problems.

The decomposition of any ω∈κ̂, which will be needed in subsequent sections, is examined next.

Let κ be an AKB, ω∈κ̂ and ω_(i)∈κ for i=1, 2, . . . , n.

1. ω is conjunctive if and only if ω=ω₁∧ω₂∧ . . . ∧ω_(n)=∧_(i=1)^(n)ω_(i). Moreover, ω is minimal if and only if for i=1, 2, . . . , n and j=1, 2, . . . , n where i≠j, ω_(i)≮ω_(j).
2. ω is disjunctive if and only if ω=ω₁∨ω₂∨ . . . ∨ω_(n)=∨_(i=1)^(n)ω_(i). Moreover, ω is minimal if and only if for i=1, 2, . . . , n and j=1, 2, . . . , n where i≠j, ω_(j)≮ω_(i).

Let κ be an AKB and ω∈κ̂. Let ω_(i)∈κ for i=1, 2, . . . , n.

1. ω is in disjunctive-conjunctive form if and only if ω=ω₁∨ω₂∨ . . . ∨ω_(n) where, for each i=1, 2, . . . , n, ω_(i)∈κ̂ and ω_(i) is minimal conjunctive. Each of the ω_(i) is called a conjunctive term. Moreover, ω is minimal if and only if for i=1, 2, . . . , n and j=1, 2, . . . , n where i≠j, ω_(i)≠ω_(j).
2. ω is in conjunctive-disjunctive form if and only if ω=ω₁∧ω₂∧ . . . ∧ω_(n) where, for each i=1, 2, . . . , n, ω_(i)∈κ̂ and ω_(i) is minimal disjunctive. Each of the ω_(i) is called a disjunctive term. Moreover, ω is minimal if and only if for i=1, 2, . . . , n and j=1, 2, . . . , n where i≠j, ω_(i)≠ω_(j).

Let κ be an AKB. Every ω∈κ̂ can be put in minimal disjunctive-conjunctive form, as well as in minimal conjunctive-disjunctive form.

In view of the above proposition, for simplicity and uniformity, unless otherwise stated, we shall assume that all ω∈κ̂ are in minimal disjunctive-conjunctive form. Of course, we could have chosen minimal conjunctive-disjunctive form instead.

Let κ be an AKB and let ω₁, ω₂∈κ̂ be expressed in minimal disjunctive-conjunctive form.

1. ω₁⊆ω₂ if and only if every conjunctive term in ω₁ is in ω₂.
2. ω₁∩ω₂ is the minimal disjunctive-conjunctive form which includes all the conjunctive terms that appear in both ω₁ and ω₂.
3. ω₁∪ω₂ is the minimal disjunctive-conjunctive form which includes all the conjunctive terms that appear in either ω₁ or ω₂.
4. ω₁−ω₂ is the minimal disjunctive-conjunctive form obtained by removing from ω₁ all the conjunctive terms that appear in ω₂.
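When a disjunctive-conjunctive form is stored as a set of conjunctive terms, the four operations above reduce to ordinary set operations. The sketch below assumes a hypothetical encoding in which a conjunctive term is a `frozenset` of names of objects in κ, and it does not re-check minimality of the results.

```python
def term(*names):
    """A conjunctive term: the conjunction of the named objects of κ."""
    return frozenset(names)


# ω1 = (w_a ∧ w_b) ∨ w_c and ω2 = w_c ∨ w_d, with hypothetical object names.
w1 = frozenset({term("w_a", "w_b"), term("w_c")})
w2 = frozenset({term("w_c"), term("w_d")})

is_subset = w1 <= w2      # every conjunctive term of ω1 is in ω2?
common = w1 & w2          # ω1 ∩ ω2
either = w1 | w2          # ω1 ∪ ω2
removed = w1 - w2         # ω1 − ω2
```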

The same notations apply to minimal conjunctive-disjunctive forms.

The following Decomposition Rules are discussed in U.S. Pat. No. 9,275,333, the content of which is incorporated herein by reference.

Let κ be an AKB and L₁, L₂∈ℒ.

1. σ_(κ)(L₁)∪σ_(κ)(L₂)⊆σ_(κ)(L₁∨L₂).
2. σ_(κ)(L₁)∩σ_(κ)(L₂)=σ_(κ)(L₁∧L₂).
3. σ̄_(κ)(L₁∧L₂)⊆σ̄_(κ)(L₁)∩σ̄_(κ)(L₂).
4. σ̄_(κ)(L₁∨L₂)=σ̄_(κ)(L₁)∪σ̄_(κ)(L₂).
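As a worked step, the inclusion for the plausibility of a conjunction can be derived from the first rule (the union of the supports of two propositions is contained in the support of their disjunction) together with the definition of plausibility as the complement of the support of the negation. The derivation below is a sketch that uses only De Morgan's law and the fact that complementation reverses inclusion.

```latex
\begin{align*}
\overline{\sigma}_{\kappa}(L_1 \wedge L_2)
  &= \bigl[\sigma_{\kappa}\bigl((L_1 \wedge L_2)'\bigr)\bigr]' \\
  &= \bigl[\sigma_{\kappa}(L_1' \vee L_2')\bigr]' \\
  &\subseteq \bigl[\sigma_{\kappa}(L_1') \cup \sigma_{\kappa}(L_2')\bigr]' \\
  &= \bigl[\sigma_{\kappa}(L_1')\bigr]' \cap \bigl[\sigma_{\kappa}(L_2')\bigr]' \\
  &= \overline{\sigma}_{\kappa}(L_1) \cap \overline{\sigma}_{\kappa}(L_2).
\end{align*}
```

The inclusion in the third line is the first rule applied to L₁′ and L₂′, with both sides complemented.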

New extensions of these rules are provided below.

Let κ be an AKB, and L₁, L₂∈ℒ.

1. If L₁∧L₂=F, then σ_(κ)(L₁∨L₂)=σ_(κ)(L₁)∪σ_(κ)(L₂).
2. If L₁∨L₂=T, then σ̄_(κ)(L₁∧L₂)=σ̄_(κ)(L₁)∩σ̄_(κ)(L₂).
3. If κ is consistent, then σ_(κ)(L₁∨L₂)=σ_(κ)(L₁)∪σ_(κ)(L₂).
4. If κ is consistent, then σ̄_(κ)(L₁∧L₂)=σ̄_(κ)(L₁)∩σ̄_(κ)(L₂).

Let κ be an AKB and ω∈κ̂. If ω is expressed in DNF over κ, where ω=ω₁∨ω₂∨ . . . ∨ω_(n), then ὼ is obtained from ω by removing all conjunctions ω_(i) from ω where r(ω_(i))⇔F and l(ω_(i))≠∅. ὼ is called the purified ω. Moreover, κ̀={ὼ|ω∈κ̂}.

Let κ be an AKB and L∈ℒ. Then σ̀_(κ)(L)=∪_(ω∈δi_(κ)(L))l(ω), where δi_(κ)(L) is the output of Algorithm 6.3.

Since σ_(κ)(L)=σ̀_(κ)(L) if κ is consistent, Algorithm 6.3 can be used to compute σ_(κ)(L) in this case. Observe that Algorithm 6.3 is a faster unification algorithm than Algorithm 6.1.

Let κ be an AKB and L∈ℒ.

1. Σ̀_(κ)(L)={ω∈κ̀|r(ω)⇒L}, Σ̀_(κ)^(∨)(L)=∨_(ω∈Σ̀_(κ)(L))ω, σ̀_(κ)(L)=l(Σ̀_(κ)^(∨)(L)) and ὴ_(κ)(L)=r(Σ̀_(κ)^(∨)(L)).
2. Φ̀_(κ)(L)={ω∈κ̀|L⇒r(ω)}, Φ̀_(κ)^(∧)(L)=∧_(ω∈Φ̀_(κ)(L))ω, ϕ̀_(κ)(L)=l(Φ̀_(κ)^(∧)(L)) and ζ̀_(κ)(L)=r(Φ̀_(κ)^(∧)(L)).

From the definition, it follows that Σ̀_(κ)^(∨)(L)∈Σ̀_(κ)(L) and Φ̀_(κ)^(∧)(L)∈Φ̀_(κ)(L).

Let κ be an AKB. κ is d-consistent if and only if σ̀_(κ)(L)=σ_(κ)(L) for every L∈ℒ.

The decomposition rules for $\grave{\sigma}_{\kappa}$ and $\bar{\grave{\sigma}}_{\kappa}$ are given below: Let κ be an AKB and $L_1, L_2\in\mathcal{L}$.

    1. $\grave{\sigma}_{\kappa}(L_1)\cup\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\vee L_2)$.
    2. $\grave{\sigma}_{\kappa}(L_1)\cap\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\wedge L_2)$.
    3. $\bar{\grave{\sigma}}_{\kappa}(L_1\wedge L_2)=\bar{\grave{\sigma}}_{\kappa}(L_1)\cap\bar{\grave{\sigma}}_{\kappa}(L_2)$.
    4. $\bar{\grave{\sigma}}_{\kappa}(L_1\vee L_2)=\bar{\grave{\sigma}}_{\kappa}(L_1)\cup\bar{\grave{\sigma}}_{\kappa}(L_2)$.

Section 7. Augmented Projection/Forecasting and Augmented Abduction

Deductive reasoning and inductive reasoning have been used to determine the validity and/or plausibility of any given proposition. In many applications, the desired proposition is not known. Instead, one is given some target propositions and must determine the propositions which can be deduced or induced from the AKB using these target propositions.

Projection and forecasting were used in the subsection on Information and Knowledge for the management and supervision of the evolution of the IKs, and abduction was used for Analysis and Deep Understanding of IKs. We shall consider projection/forecasting and abduction in more general settings. We shall refer to the latter as Augmented Projection/Forecasting and Augmented Abduction; these form the basis for the Knowledge Augmenter (KA).

First, we shall introduce some new concepts and results. Moreover, we shall use them to design and develop various solutions for solving the Projection/Forecasting and Abduction problems, under different circumstances.

Let κ be an AKB, and L, L₁, L₂∈ℒ.

    1. L₂ is a κ-deductive-consequent, or κ-DC, of L₁ if and only if σ_(κ)(L₁⇒L₂)≠∅. The collection of all κ-DC of L will be denoted by DC_(κ)(L). In other words, DC_(κ)(L) consists of all L₀∈ℒ which can be deductively projected/forecast from L using the validity measures.
    2. L₁ is a κ-deductive-antecedent, or κ-DA, of L₂ if and only if σ_(κ)(L₁⇒L₂)≠∅. The collection of all κ-DA of L will be denoted by DA_(κ)(L). In other words, DA_(κ)(L) consists of all L₀∈ℒ which can be deductively abducted from L using the validity measures.
    3. L₂ is a κ-inductive-consequent, or κ-IC, of L₁ if and only if $\bar{\phi}_{\kappa}(L_1\Rightarrow L_2)\neq\emptyset$. The collection of all κ-IC of L will be denoted by IC_(κ)(L). In other words, IC_(κ)(L) consists of all L₀∈ℒ which can be inductively projected/forecast from L using the validity measures.
    4. L₁ is a κ-inductive-antecedent, or κ-IA, of L₂ if and only if $\bar{\phi}_{\kappa}(L_1\Rightarrow L_2)\neq\emptyset$. The collection of all κ-IA of L will be denoted by IA_(κ)(L). In other words, IA_(κ)(L) consists of all L₀∈ℒ which can be inductively abducted from L using the validity measures.

DC and IC may be viewed as projection/forecasting of the given proposition using deductive and inductive reasoning, respectively. In other words, DC is deductive projection/forecasting while IC is inductive projection/forecasting. On the other hand, DA and IA may be viewed as abduction of the given proposition using deductive and inductive reasoning, respectively. In other words, DA is deductive abduction while IA is inductive abduction. Existing abductive reasoning discussed by Eugene Charniak and Drew McDermott, “Introduction to Artificial Intelligence,” Addison-Wesley, 1985, corresponds to deductive abduction given above.

Let κ be an AKB and $L\in\mathcal{L}$.

    1. $IC_{\kappa}(L)=DC_{\bar{\kappa}}(L)$.
    2. $IA_{\kappa}(L)=DA_{\bar{\kappa}}(L)$.

Let κ₁ and κ₂ be AKBs and L∈ℒ.

    1. If κ₁ and κ₂ are equivalent, then $DC_{\kappa_1}(L)=DC_{\kappa_2}(L)$ and $DA_{\kappa_1}(L)=DA_{\kappa_2}(L)$.
    2. If κ₁ and κ₂ are i-equivalent, then $IC_{\kappa_1}(L)=IC_{\kappa_2}(L)$ and $IA_{\kappa_1}(L)=IA_{\kappa_2}(L)$.

We concentrate on DC and DA. All the results obtained for DC and DA can then be transformed into results for IC and IA, respectively, by replacing κ with $\bar{\kappa}$.

    1. Let L₀∈DC_(κ)(L). L₀ is D-maximal wrt L if and only if for every L₁∈DC_(κ)(L), L₁⇒L₀ implies σ_(κ)(L₁⇒L)⊆σ_(κ)(L₀⇒L).
    2. Let L₀∈DA_(κ)(L). L₀ is D-minimal wrt L if and only if for every L₁∈DA_(κ)(L), L₀⇒L₁ implies σ_(κ)(L₀⇒L)⊆σ_(κ)(L₁⇒L).

In general, construction of dC_(κ)(L) and dA_(κ)(L) can be done by adding U→A′ and U→A, respectively, and then performing unification.

Algorithm 7.1. Projection/Forecasting (Construction of dC_(κ)(P))

FIG. 6 is a flow chart of determining projection/forecasting based upon the knowledge fragments, according to an embodiment. At 602, initialize 3 new knowledge bases. At 604, initialize another new knowledge base Σ₃. At 606, apply unification to the elements in Σ₁ and Σ₂, and place the results in Σ₃. At 608, check whether or not Σ₃ is empty. At 610, update Σ₀ and Σ₂ and remove certain elements from Σ₀. At 612, output Σ₀, the results of Projection/Forecasting on P.

Given AKB κ, an atomic proposition P∈ℒ, output Σp₀.

    1. Transform κ into an irreducible AKB.
    2. Let Σp₀={ω∈κ|P′∈ρ(ω)}.
    3. Let Σp₁=κ−Σp₀.
    4. Let Σp₂=Σp₀.
    5. Let Σp₃=∅.
    6. Repeat the following for each ω₁∈Σp₁ and ω₂∈Σp₂.
    7. Let ω₃=ω₁⋄ω₂.
    8. If ω₃ is undefined, ignore ω₃. Else add ω₃ to Σp₃.
    9. If Σp₃=∅, return Σp₀.
    10. Let Σp₀=Σp₀∪Σp₃ and Σp₂=Σp₃.
    11. Remove from Σp₀ all elements of Σp₀ that are not D-maximal wrt P.
    12. Go back to Step 5.
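The control flow of Steps 2 to 12 can be sketched as follows. This is only an illustration under stated assumptions: the unification operator ⋄, the selection test of Step 2 and the D-maximal test of Step 11 are passed in as the hypothetical callables `unify`, `selects` and `keep`, since their definitions lie outside this passage; Step 1 is assumed to have been applied already; and the `seen` set is an added safeguard, not part of the algorithm as stated, so that an object is never generated twice. The toy knowledge objects and `toy_unify` are stand-ins invented for the example and are not the actual ⋄ operation.

```python
from typing import Callable, Hashable, Iterable, Optional, Set


def saturate(kb: Iterable[Hashable],
             selects: Callable[[Hashable], bool],
             unify: Callable[[Hashable, Hashable], Optional[Hashable]],
             keep: Callable[[Hashable, Set[Hashable]], bool]) -> Set[Hashable]:
    """Repeat unification until no new object appears (Steps 2-12)."""
    kb = set(kb)
    s0 = {w for w in kb if selects(w)}       # Step 2
    s1 = kb - s0                              # Step 3
    s2 = set(s0)                              # Step 4
    seen = set(s0)                            # assumption: avoid regenerating objects
    while True:
        s3 = set()                            # Step 5
        for w1 in s1:                         # Step 6
            for w2 in s2:
                w3 = unify(w1, w2)            # Step 7
                if w3 is not None and w3 not in seen:
                    s3.add(w3)                # Step 8
        if not s3:                            # Step 9
            return s0
        seen |= s3
        s0 |= s3                              # Step 10
        s2 = s3
        s0 = {w for w in s0 if keep(w, s0)}   # Step 11; loop back to Step 5


# Toy objects (evidence, head, tail), read as "evidence -> (head => tail)".
a = (frozenset({1, 2}), "P", "Q")
b = (frozenset({2, 3}), "Q", "R")
c = (frozenset({3}), "R", "S")


def toy_unify(w1, w2):
    """Chain w2 with w1 when they link up and share evidence (illustrative only)."""
    shared = w1[0] & w2[0]
    if w2[2] == w1[1] and shared:
        return (shared, w2[1], w1[2])
    return None


result = saturate([a, b, c], lambda w: w[1] == "P", toy_unify, lambda w, s: True)
```

Because only the selection test and the pruning test differ, the same loop could serve for Algorithm 7.2 by supplying a different `selects` and a D-minimal `keep`.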

Algorithm 7.2. Abduction (Construction of dA_(κ)(P))

FIG. 7 is a flow chart of determining abduction for the knowledge fragments, according to an embodiment. All blocks 702, 704, 706, 708, 710 and 712 are the same as 602 to 612, except that the initialization of Σ₀ uses P instead of P′, 710 uses D-minimal instead of D-maximal, and 712 outputs the results of abduction on P.

Given AKB κ, an atomic proposition P∈ℒ, output Σa₀.

    1. Transform κ into an irreducible AKB.
    2. Let Σa₀={ω∈κ|P∈ρ(ω)}.
    3. Let Σa₁=κ−Σa₀.
    4. Let Σa₂=Σa₀.
    5. Let Σa₃=∅.
    6. Repeat the following for each ω₁∈Σa₁ and ω₂∈Σa₂.
    7. Let ω₃=ω₁⋄ω₂.
    8. If ω₃ is undefined, ignore ω₃. Else add ω₃ to Σa₃.
    9. If Σa₃=∅, return Σa₀.
    10. Let Σa₀=Σa₀∪Σa₃ and Σa₂=Σa₃.
    11. Remove from Σa₀ all elements of Σa₀ that are not D-minimal wrt P.
    12. Go back to Step 5.

Let κ be an AKB and L∈ℒ. By expressing L as a disjunction of conjunctions or a conjunction of disjunctions, the algorithms and results given above provide a complete solution for the construction of dC_(κ)(L) and dA_(κ)(L), i.e., projection/forecasting and abduction, respectively.

If κ is endowed with a probabilistic measure m together with its extension $\tilde{m}$, then dC_(κ)(L) and/or dA_(κ)(L) can be sorted in order of $\tilde{m}(\omega)$ for ω in dC_(κ)(L) and/or dA_(κ)(L). In this manner, the “best” or “most probable” projection/forecasting or abduction can be found at the top of the corresponding sorted lists. The remaining members of the sorted lists provide additional alternatives for projection/forecasting or abduction. A threshold for projection/forecasting and/or abduction can be enforced by removing all the members of the sorted lists that fail to meet a specified value.
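The sorting and thresholding just described can be sketched in a few lines. The sketch assumes that each candidate can be scored by a function standing in for the extended measure; `rank_candidates`, `measure` and `threshold` are hypothetical names, and treating the threshold as a lower bound on the score is an assumption made for illustration.

```python
from typing import Callable, Hashable, Iterable, List, Optional


def rank_candidates(candidates: Iterable[Hashable],
                    measure: Callable[[Hashable], float],
                    threshold: Optional[float] = None) -> List[Hashable]:
    """Sort candidates from most to least probable, optionally dropping low scores."""
    ranked = sorted(candidates, key=measure, reverse=True)
    if threshold is not None:
        # Assumption: members scoring below the threshold are removed.
        ranked = [w for w in ranked if measure(w) >= threshold]
    return ranked


# Hypothetical scores standing in for the extended measure of three candidates.
scores = {"omega_1": 0.2, "omega_2": 0.9, "omega_3": 0.5}
best_first = rank_candidates(scores, scores.get)
above_half = rank_candidates(scores, scores.get, threshold=0.5)
```

The first element of the returned list plays the role of the "best" projection or abduction, and the remaining elements are the alternatives.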

iC_(κ)(L) and iA_(κ)(L) can also be defined and constructed in a similar manner using $\bar{\kappa}$.

This Section shows how projection/forecasting and/or abduction can be realized given any propositions using either deductive or inductive reasoning based on the validity measures. All the above concepts and results can be carried over if plausibility measures are used. Moreover, if the AKB is consistent, then the use of plausibility measures could provide a wider range of projection/forecasting and/or abduction.

Given L∈ℒ, general methods for computing dC_(κ)(L), dA_(κ)(L), iC_(κ)(L) and iA_(κ)(L) are shown above. In other words, they provide a general solution for the projection/forecasting problem as well as the abduction problem.

Now consider L∈ℒ. If we are interested in projection/forecasting with respect to L, we can first determine dA_(κ)(L), and then compute dC_(κ)(dC_(κ)(L)). The result represents the projection/forecasting of the possible sources which gave rise to L.

Let κ be an AKB and L∈ℒ. Then dC_(κ)(L)⊆dC_(κ)(dC_(κ)(L)).

The dC_(κ)(dC_(κ)(L)) given above may be viewed as a second order projection/forecasting with respect to L. Obviously, we can continue in this fashion to obtain higher order projections/forecasting.

If we replace dC by dA, then we have higher order abductions if necessary. These are done deductively and with validity measures. In the same manner, we may consider higher order projections/forecasting and abductions done deductively and/or inductively with either the validity measures and/or plausibility measures. These could provide much wider ranges of results compared to first-order projections/forecasting and abductions alone, which were discussed earlier in this section.

Section 8. A-Exploration

A-Exploration is an overall framework to accommodate and deal with the efforts and challenges mentioned above. It is intended to provide the capabilities to organize and understand, including the ramification of the IKs captured in real-time, as well as to manage, utilize and oversee these IKs to serve the intended users. Management of the IKs may include providing the necessary measures, whenever possible, to guide or redirect the courses of actions of the IKs, for the betterment of the users.

A. Introduction

As stated above, an important component of A-Exploration is the facilities to transform the representation of information fragments acquired by LRIRE into more robust and flexible representations—knowledge fragments. The various transformations, carried out in A-Exploration, are determined in such a way that it can optimally accomplish its tasks; e.g., transformation of CGs into CBKFs, CGs into AKFs, CBKFs into AKFs, AKFs into CBKFs, etc.

The transformation from CGs into CBKFs requires that we have the necessary knowledge about the relationships involving concepts and features. The requisite knowledge is normally available in semantic networks, WordNet, etc. Although not necessary, due to the nature of CBKB, addition of probabilities will make the results of the transformation more precise. These probabilities could be specified as part of the semantic networks, etc. to represent the strength of the relationships. If no probabilities are specified, we may assume that they are 1. In any case, the probabilities given have to be adjusted when they are used in the CBKB to guarantee that they satisfy the requirements of the CBKB. The process of transforming CGs into CBKFs, including modification of the probabilities, can be easily accomplished and automated.

The main advantage of knowledge fragments, such as CBKFs and AKFs, is the fact that they are parts of knowledge bases, and therefore are amenable to reasoning. This allows the possibility of pursuing and engaging in the various functions listed above, including understanding and projecting the directions of further progression of the IKs, to foresee how they may influence other subjects and/or areas. Ways of accomplishing the functionalities listed above are discussed in more detail below.

Methods and algorithms, introduced and/or presented in CBKBs and AKBs, are modified and tailored to analyze and understand the IKs. In particular, identifying the pertinent inference graphs in a CBKB could provide the means to narrow down the analysis and supply the vehicle to explore deeper understanding of the IKs.

For CBKBs and AKBs, full analyses of the IKs can be performed deductively and/or abductively; while for AKBs, we can also perform the analyses inductively. These allow the presentations of a wider range of possibilities for understanding, comprehending and appreciating the ongoing evolution or progression of the IKs.

B. Comprehension, Analysis and Deep Understanding of IKs

Analyses of the IKs require deep understanding of many aspects of the IKs. It is backward looking and entails finding the best explanations for the various scenarios of the IKs. In other words, abduction is the key to analyses. Since both CBKBs and AKBs permit abduction, they can provide the instruments to analyze and better understand the IKs. This forms the basis for the Deep Comprehender subcomponent of the Augmented Analyzer (AA).

Abductive reasoning can be used in both CBKBs and AKBs with deductive reasoning. This permits a better understanding of the IKs and can provide the explanations of the possible courses/paths of how the IKs had evolved or progressed. Moreover, for AKB, abduction can be coupled not only with deduction (that is what existing abduction is all about), but can also be coupled with induction, and thereby allowing awareness and comprehension of suitable/fitting developments, generalizations and/or expansions of the desired IKs. For both CBKBs and AKBs, they can provide the following functionalities for A-Exploration:

    1. Identify the Organization and Chronology of the Information and Knowledge.
    2. Comprehend and Understand the Various Occurrences and Happenings.
    3. Resolve and Establish the Reasons and Explanations for the Origins and Causes of the Various Incidences and Manifestations.
    4. Determine the Causal Effects and Relationships of the Information and Knowledge.
    5. Cognizance and Knowledge of Areas Related or Linked to the Information and Knowledge.

Since the power of abduction is the understanding and explanation of the IKs, resolutions of items above, which are related to Item 2 in the list of the functionalities of A-Exploration, can be achieved. In general, by exploiting and customizing (deductive or inductive) abduction, it may be possible to uncover the likely solutions.

C. Manage and Supervise the Evolution of the IKs

When employing only deductive inferencing, the results provided by the above procedures will be referred to as deductive prediction, projection, and/or forecasting. The same methods are equally applicable to inductive inference which is available for AKBs. Inductive prediction, projection and/or forecasting present much larger opportunities to produce more diverse results, which may be more informative and valuable to the question at hand.

Both CBKBs and AKBs have the capabilities of exploring for the desired/ideal alternative outcomes by managing and supervising the evolution, with user inputs taken into consideration through the user specifications spec₁, . . . , spec_(m). Thus, they will permit A-Exploration to offer solutions to various functionalities such as those given below:

    1. Appreciation and Consciousness of the Likely Influences from External Subjects or Areas.
    2. Awareness and Recognition of the Ramifications.
    3. Verification and Anticipation of the Possible Spread and Impact to Other Areas.
    4. Realization and Management of the Plausible Consequences.
    5. Project, Predict and Forecast the Effects and Consequences of the Various Actions and Activities.
    6. Instigate Measures to Supervise and Regulate the Information and Knowledge.
    7. Initiate Credible Actions to Mitigate and/or Redirect the Plausible Effects and Outcomes of the Information and Knowledge.

Moreover, we can also customize and/or adapt various methods and algorithms, available for CBKBs and/or AKBs, to predict, project and/or forecast future directions of the IKs. This forms the basis for the Explorer of Alternative Outcomes subcomponent of the Augmented Analyzer (AA).

With CBKBs, predicting, projecting and/or forecasting based on knowledge of established or proven IKs (complete or partial) can be done using existing mechanisms available in any CBKBs. Since prediction, projection and/or forecasting are forward looking, the usual inference mechanisms can be exploited for that purpose. In CBKBs, this means:

    1. Extract the essential knowledge from the knowledge bases in question.
    2. Form the inference graph(s).
    3. Perform basic inferencing.

If more than one inference graph is available, the results can be sorted according to the values of the inference graphs. In this case, different scenarios and their possible outcomes may be offered to the intended users. When user input is taken into consideration, this forms the basis for the Augmented Supervisor (AS). The user input can be menu-based inputs and/or term-based inputs and/or natural language-based inputs and/or database inputs.

D. Proprietary Hypothesis Plug-Ins (PHP)

Most of the basic information and knowledge involved in the creation of the knowledge bases, such as AKB, CBKB, etc. are available in the public domain. Additional up-to-date information can be obtained using the LRIRE given in Section 4. These information and knowledge make up the bulk of the desired AKB, CBKB and other knowledge bases.

The requisite knowledge may also be available as proprietary hypothesis plug-ins, which allows the construction of proprietary knowledge bases, including AKFs, CBKFs, etc. A specific example of a proprietary hypothesis plug-in is the relation between a process and its efficacy, or a drug and how it is connected to certain illnesses.

Proprietary information (including patents and other intellectual properties, private notes and communications, etc.) forms the backbone of most businesses or enterprises, and virtually all companies maintain certain proprietary information and knowledge. These can be used as proprietary hypothesis plug-ins. A collection of hypothesis plug-ins may be kept in proprietarily constructed semantic networks, AKBs, CBKBs, etc. Part or all of this collection may be made available under controlled permitted or authorized limited access when constructing the desired AKBs, CBKBs, etc. For a large collection, LRIRE, or some simplified form of LRIRE, can be used to automate the selection of the relevant information and/or knowledge.

The availability of proprietary hypothesis plug-ins provides the owners with additional insights and knowledge not known to the outside world. It could offer the owners more opportunities to explore other possible outcomes to their exclusive advantages.

The available plug-ins could supply the missing portions in our exploration of the desired/ideal alternative outcomes. Proprietary hypothesis plug-ins need not be comprised of only proven or established proprietary knowledge. They may contain interim/preliminary results, conjectures, suppositions, provisional experimental outcomes, etc. As stated above, when using AKBs, CBKBs, etc. to house the proprietary hypothesis plug-ins, the unproven items can be signified by specifying lower probabilities and/or reliabilities.

Clearly, this collection of hypothesis plug-ins can grow as more proprietary information and knowledge, including intellectual properties, is accumulated. It could become one of the most valuable resources of the company.

E. Wrapping-Up

CBKBs and AKBs have the capabilities of inducing and/or exploring additional desired/ideal alternative outcomes for the IKs. This can be achieved by augmenting the CBKBs or AKBs with temporary, non-permanent and/or transitory knowledge fragments. The optional knowledge could consist of hypotheses generated using interim/partial/provisional results, conjectures, unproven or not completely proven outcomes, or simply guesswork. To maintain the integrity of the knowledge bases, the added knowledge should be associated with lower probabilities and/or reliabilities. Alternatively, the non-permanent knowledge should be held in separate knowledge bases. At any rate, any unsubstantiated hypotheses or temporary items not deemed feasible or useful should be removed promptly from the knowledge bases. In cases where there are multiple alternative outcomes, they can be sorted so the intended users can select the desired options.
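One way to picture the handling of non-permanent knowledge is sketched below. It is a minimal illustration and not the described system: the classes `Fragment` and `KnowledgeStore`, the `cap` of 0.5 placed on provisional reliabilities, and the example fragment names are all assumptions introduced here. The sketch combines both options mentioned above, lowering the reliability of provisional items and holding them in a separate store so that they can be removed promptly.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Fragment:
    name: str
    reliability: float
    provisional: bool = False


@dataclass
class KnowledgeStore:
    permanent: Dict[str, Fragment] = field(default_factory=dict)
    provisional: Dict[str, Fragment] = field(default_factory=dict)

    def add(self, frag: Fragment, cap: float = 0.5) -> None:
        """Store a fragment; provisional ones are kept apart with capped reliability."""
        if frag.provisional:
            capped = Fragment(frag.name, min(frag.reliability, cap), True)
            self.provisional[frag.name] = capped
        else:
            self.permanent[frag.name] = frag

    def discard(self, name: str) -> None:
        """Remove a provisional fragment that is no longer deemed useful."""
        self.provisional.pop(name, None)

    def view(self) -> Dict[str, Fragment]:
        """Combined view used for reasoning: permanent plus provisional fragments."""
        return {**self.permanent, **self.provisional}


store = KnowledgeStore()
store.add(Fragment("established-result", 0.95))
store.add(Fragment("interim-hypothesis", 0.9, provisional=True))
with_hypothesis = store.view()
store.discard("interim-hypothesis")
without_hypothesis = store.view()
```

Keeping the provisional fragments in their own dictionary means that discarding them never touches the permanent knowledge.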

Section 11 below, on Conjectures and Scientific Discoveries, explores desired/ideal alternative outcomes in connection with finding the “missing links” in scientific discoveries.

Due to the richness of the structures of CBKBs and AKBs, depending on the problem at hand, one could initiate and/or institute novel approaches and/or techniques to accomplish its goals.

The potential that CBKBs and AKBs are capable of handling and managing the above requirements and conditions is the main reason we have chosen CBKBs and/or AKBs to play a central role in A-Exploration.

Both CBKBs and AKBs can be used in A-Exploration to achieve the intended goals, especially if we are interested in deductive reasoning. However, CBKBs are more visual and allow the users to picture the possible scenarios and outcomes. On the other hand, though AKBs are more powerful, they are also more logical. Theoretically, anything that can be accomplished using CBKBs can be accomplished using AKBs, and more. Indeed, depending on the problems and/or IKs involved, it may be advantageous to formulate the problems using either or both CBKBs and AKBs, and to allow the switching from one formulation to the other, and vice versa. With the type of information/knowledge considered in A-Exploration, it is not difficult to transform and switch from one formulation to the other, and vice versa.

It is possible to have multiple A-Exploration systems, each having its own objectives. These A-Exploration systems may then be used to build larger systems in hierarchical and/or other fashions, depending on their needs. Clearly, these systems can cross-pollinate to enhance their overall effects. Various subsystems of different A-Exploration systems may be combined to optimize their functions, e.g. A²DR.

Section 10. Emergent Events and Behaviors—subcomponent Emergence Detector of Augmented Analyzer (AA)

In this Section, we shall examine emergent behaviors; see the discussion by Timothy O'Connor, Hong Yu Wong, and Edward N. Zalta, “Emergent Properties,” The Stanford Encyclopedia of Philosophy, 2012. Emergent behaviors arise in a complex system mainly due to the interactions among the subsystems. They are behaviors of the complex system not manifested by the individual subsystems. When dealing with a complex system, emergent behaviors are generally unexpected and hard to predict. In this invention, we develop a formal characterization of emergent behavior and provide a means to quickly verify whether a behavior is emergent or not, especially with respect to AKBs. However, we shall first introduce a new object, the consistent event, and make use of a new method for unification without inconsistencies.

Consistent AKBs are defined as follows: Let κ be an AKB. κ is consistent if and only if G=∅, subject to

_(k), whenever

$G\overset{d}{\left. \rightarrow{}_{\kappa} \right.}{F.}$

A new object, the κ-consistent event, which is essential in dealing with emergent behaviors, etc., is introduced here: Let κ be an AKB and $\omega\in\hat{\kappa}$. ω is a κ-consistent event if and only if for every $\omega_0\in\hat{\kappa}$, if $r(\omega_0)=r(\omega)$, then $l(\omega_0)=l(\omega)$.

Let κ be an AKB and $\omega\in\hat{\kappa}$. Let ω be expressed in minimal disjunctive-conjunctive form $\omega_1\vee\omega_2\vee\ldots\vee\omega_n$. Then $\grave{\kappa}(\omega)$ is obtained from ω by removing all conjunctions $\omega_i$ from ω where $r(\omega_i)\equiv F$ and $l(\omega_i)\neq\emptyset$. Moreover, $\grave{\kappa}=\{\grave{\kappa}(\omega)\mid\omega\in\hat{\kappa}\}$.

Let κ be an AKB and $\omega\in\hat{\kappa}$. ω is a κ-consistent event if and only if $\omega\in\grave{\kappa}$.

The objects can also be associated with consistent events. In this case, we have created many new objects, and to distinguish them from the old ones, we shall represent them by adding an accent to these new objects, such as $\grave{\sigma}$ for σ, etc.

For consistent events, the decomposition rules become: Let κ be an AKB and $L_1, L_2\in\mathcal{L}$.

    1. $\grave{\sigma}_{\kappa}(L_1)\cup\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\vee L_2)$.
    2. $\grave{\sigma}_{\kappa}(L_1)\cap\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\wedge L_2)$.
    3. $\bar{\grave{\sigma}}_{\kappa}(L_1\wedge L_2)=\bar{\grave{\sigma}}_{\kappa}(L_1)\cap\bar{\grave{\sigma}}_{\kappa}(L_2)$.
    4. $\bar{\grave{\sigma}}_{\kappa}(L_1\vee L_2)=\bar{\grave{\sigma}}_{\kappa}(L_1)\cup\bar{\grave{\sigma}}_{\kappa}(L_2)$.

Let κ be an AKB. κ is consistent if and only if $\grave{\sigma}_{\kappa}(L)=\sigma_{\kappa}(L)$ for every $L\in\mathcal{L}$.

Let κ be an AKB and $L\in\mathcal{L}$. Then $\grave{\sigma}_{\kappa}(L)=\bigcup_{\omega\in\delta_{\kappa}(L)}l(\omega)$, where $\delta_{\kappa}(L)$ is the output of Algorithm 7.3.

Since $\sigma_{\kappa}(L)=\grave{\sigma}_{\kappa}(L)$ if κ is consistent, Algorithm 7.3 can be used to compute $\sigma_{\kappa}(L)$ in that case. Observe that Algorithm 7.3 is a faster unification algorithm than Algorithm 7.1.

Observe that in Algorithm 6.3, although inconsistencies may exist in κ and may not be removed, they were completely ignored in the unification process. However, since the inconsistencies may permeate through the entire system, they may create many interesting properties and complications.

In the above discussions, we introduced deductive reasoning and inductive reasoning, and used them to determine the validity and/or plausibility of any given proposition. However, the validity and plausibility of a given proposition may be compromised due to inconsistencies. Although there are related algorithms for transforming any AKB κ into a consistent AKB, we showed in the previous section how to deal with inconsistencies directly, without having to determine them in advance or remove them first, i.e., by considering $\grave{\kappa}$ instead of κ.

The related discussions involving AKBs dealt primarily with consistent AKBs. However, inconsistencies are an integral part of any AKBs involved with emergent behaviors. Various methods could be considered for removing the inconsistencies and constructing consistent AKBs to take their places. In this invention, we show how to deal with inconsistencies directly, without having to determine them in advance or remove them first. To avoid the complexities introduced by inconsistencies, we shall use $\grave{\sigma}$ instead of σ, etc.

In the rest of the section, we shall assume that n>0 and κ₁, κ₂, . . . , κ_(n) are AKBs. Moreover, $\kappa=\bigcup_{i=1}^{n}\kappa_i$, $\tau=\bigcup_{i=1}^{n}\hat{\kappa}_i$, and for $L\in\mathcal{L}$, $\Sigma(L)=\bigvee_{i=1}^{n}\Sigma_{\kappa_i}(L)$ and $\grave{\Sigma}(L)=\bigvee_{i=1}^{n}\grave{\Sigma}_{\kappa_i}(L)$. Moreover, $\sigma(L)=\bigcup_{i=1}^{n}\sigma_{\kappa_i}(L)$, $\grave{\tau}=\{\omega\in\grave{\kappa}\mid\omega\in\tau\}$, and $\grave{\sigma}(L)=\bigcup_{i=1}^{n}\grave{\sigma}_{\kappa_i}(L)$.

Clearly, $\tau\subseteq\hat{\kappa}$ and $\grave{\Sigma}(L)\subseteq\grave{\Sigma}_{\kappa}(L)$. These differences provide a first step in the study of emergent events and behaviors. Moreover, due to the interaction among the various parts, inconsistencies invariably occur when the parts are combined into the whole.

Let $L\in\mathcal{L}$ and $\omega\in\hat{\kappa}$.

    1. L is a d-valid κ-emergent behavior if and only if for all i=1, 2, . . . , n, $\grave{\sigma}_{\kappa_i}(L)\subset\grave{\sigma}_{\kappa}(L)$.
    2. L is a d-plausible κ-emergent behavior if and only if for all i=1, 2, . . . , n, $\bar{\grave{\sigma}}_{\kappa_i}(L)\subset\bar{\grave{\sigma}}_{\kappa}(L)$.
    3. L is an i-valid κ-emergent behavior if and only if for all i=1, 2, . . . , n, $\bar{\grave{\phi}}_{\kappa_i}(L)\subset\bar{\grave{\phi}}_{\kappa}(L)$.
    4. L is an i-plausible κ-emergent behavior if and only if for all i=1, 2, . . . , n, $\grave{\phi}_{\kappa_i}(L)\subset\grave{\phi}_{\kappa}(L)$.
    5. ω is a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent event if and only if r(ω) is a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent behavior.

Let $L\in\mathcal{L}$ and $\omega\in\hat{\kappa}$. If L is a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent behavior, then ($\grave{\sigma}_{\kappa}(L)\neq\emptyset$, $\bar{\grave{\sigma}}_{\kappa}(L)\neq\emptyset$, $\bar{\grave{\phi}}_{\kappa}(L)\neq\emptyset$, $\grave{\phi}_{\kappa}(L)\neq\emptyset$).

Example 1. Let κ₁={E₁→B} and κ₂={E₂→(B⇒C)}. If (E₁∩E₂)≠∅, then (E₁∩E₂)→C is a d-valid κ-emergent event, and C is a d-valid κ-emergent behavior.
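Example 1 can be checked numerically with a small sketch. The following assumptions are made only for illustration: the evidence sets E₁ and E₂ are given concrete hypothetical values; the rules are modelled as Boolean functions over the atoms B and C; and the support of a proposition is approximated by taking, over every non-contradictory subset of the knowledge base whose rules jointly entail the proposition, the intersection of the associated evidence sets. Only conjunctions of knowledge-base elements are considered, which suffices for this example but is not the general construction.

```python
from itertools import combinations, product

ATOMS = ("B", "C")


def assignments():
    """Enumerate all truth assignments over the atoms."""
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def support(kb, goal):
    """Union of shared evidence over consistent rule subsets that entail the goal."""
    result = set()
    for size in range(1, len(kb) + 1):
        for subset in combinations(kb, size):
            rules = [rule for _, rule in subset]
            models = [env for env in assignments() if all(r(env) for r in rules)]
            # Contradictory rule sets (no models) are skipped.
            if models and all(goal(env) for env in models):
                result |= set.intersection(*(set(e) for e, _ in subset))
    return result


E1, E2 = {1, 2, 3}, {2, 3, 4}                             # hypothetical evidence sets
kappa1 = [(E1, lambda env: env["B"])]                     # E1 -> B
kappa2 = [(E2, lambda env: (not env["B"]) or env["C"])]   # E2 -> (B => C)
kappa = kappa1 + kappa2


def goal_c(env):
    return env["C"]


whole = support(kappa, goal_c)
parts = [support(kappa1, goal_c), support(kappa2, goal_c)]
is_emergent = all(p < whole for p in parts)   # strict inclusion for every part
```

Neither part supports C on its own, while the combined knowledge base supports it on the shared evidence E₁∩E₂, which is the strict inclusion required by the definition of a d-valid κ-emergent behavior.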

Let L∈ℒ. L is a (d-valid, d-plausible, i-valid, i-plausible) κ-non-emergent behavior if and only if L is NOT a (d-valid, d-plausible, i-valid, i-plausible) κ-emergent behavior.

In view of one of the algorithms shown above, we may assume that the κ_(i) as well as κ are all disjunctive. Moreover, given L∈ℒ, L can be expressed in atomic CNF form. Thus, we can use these to determine whether L is a d-valid κ-emergent behavior or not. More precisely:

Algorithm 10.1. (Algorithm for determining whether L is a κ-emergent behavior or not) Given AKBs κ_(i), i=1, 2, . . . , n, $\kappa=\bigcup_{i=1}^{n}\kappa_i$ and L∈ℒ, determine whether L is a κ-emergent behavior or not.

    1. Transform κ and κ_(i), i=1, 2, . . . , n, into disjunctive AKBs.
    2. Express L in atomic CNF form over κ and over κ_(i), i=1, 2, . . . , n.
    3. Find $\grave{\sigma}_{\kappa}(L)$ and $\grave{\sigma}_{\kappa_i}(L)$, i=1, 2, . . . , n, using $\grave{\sigma}_{\kappa}(L_1)\cup\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\vee L_2)$ and $\grave{\sigma}_{\kappa}(L_1)\cap\grave{\sigma}_{\kappa}(L_2)=\grave{\sigma}_{\kappa}(L_1\wedge L_2)$.
    4. Use the definition to determine whether L is a κ-emergent behavior or not, and output the result.

Step 3 in the above algorithm requires the values of $\grave{\sigma}_{\kappa}(P)$ and $\grave{\sigma}_{\kappa_i}(P)$, i=1, 2, . . . , n, for every atomic proposition P over κ. To facilitate matters and avoid repeated calculations, we can predetermine all these values and store them for easy access later. In this manner, Algorithm 10.1 can work quickly when needed.
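Steps 3 and 4 of Algorithm 10.1, together with the precomputed values just mentioned, can be sketched as follows. The sketch assumes that the support of every atomic proposition has already been computed and stored in a lookup table for the whole AKB and for each part, and that L is given in CNF as a list of clauses, each clause being a list of atom names; the names `support_of_cnf` and `is_emergent`, the `universe` argument, and the example tables are hypothetical. The two decomposition rules translate directly into a union within each clause and an intersection across clauses.

```python
from typing import Dict, FrozenSet, List

Clause = List[str]                   # a disjunction of atomic propositions
CNF = List[Clause]                   # a conjunction of clauses
Table = Dict[str, FrozenSet[int]]    # precomputed support of each atom


def support_of_cnf(cnf: CNF, atom_support: Table,
                   universe: FrozenSet[int]) -> FrozenSet[int]:
    """Union over each clause, intersection across clauses (Step 3)."""
    result = universe
    for clause in cnf:
        clause_support: FrozenSet[int] = frozenset()
        for atom in clause:
            clause_support |= atom_support.get(atom, frozenset())
        result &= clause_support
    return result


def is_emergent(cnf: CNF, whole: Table, parts: List[Table],
                universe: FrozenSet[int]) -> bool:
    """Step 4: every part's support must be strictly inside the whole's support."""
    whole_support = support_of_cnf(cnf, whole, universe)
    return all(support_of_cnf(cnf, part, universe) < whole_support
               for part in parts)


universe = frozenset({1, 2, 3, 4})
whole_table = {"B": frozenset({1, 2, 3}), "C": frozenset({2, 3})}
part_tables = [{"B": frozenset({1, 2, 3})}, {}]   # hypothetical per-part tables

c_is_emergent = is_emergent([["C"]], whole_table, part_tables, universe)
b_is_emergent = is_emergent([["B"]], whole_table, part_tables, universe)
```

Because the tables are built once and only looked up afterwards, each query costs a number of set operations proportional to the size of the CNF.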

The cases involving d-plausible, i-valid, and i-plausible can be processed in similar manners.

Section 11. Conjectures and Scientific Discoveries—Missing Link Hypothesizer subcomponent of Augmented Analyzer (AA)

Conjectures, see discussion by Karl Popper, “Conjectures and Refutations: The Growth of Scientific Knowledge,” Routledge, 1963, play a very important role in scientific and other discoveries. Conjectures may be derived from experiences, educated guesses, findings from similar or related problems, preliminary experimental outcomes, etc. In general, they provide the missing links to clues in the discoveries. However, for most discoveries, there might be many clues or conjectures and it may be expensive to pursue the clues in general and/or to pursue the clues individually (as a matter of fact, it might be very costly even to pursue just a single clue). We shall show how AKBs, and similarly structured knowledge bases and systems, such as CBKBs, can be used to simplify and accelerate the search for the missing links.

Let κ be an AKB and m a κ-measure.

    1. A conjecture over κ is an object of the form E→L where L∈ℒ and E⊆U, but m(E) may or may not be known, or may be subject to changes.
    2. Let λ be a collection of conjectures over κ. Then κ∪λ is a λ-conjectured AKB wrt κ. In this case, we shall refer to κ∪λ as a conjectured AKB.

We shall occasionally refer to λ alone as the conjectures.

In our discussion of conjectured AKBs, we are usually not concerned with the measures. Thus, we shall view conjectured AKBs as ordinary AKBs. When the measures become part of the discussion, they will be stated explicitly.

The central elements for finding the missing links are either the sets Γ or the sets Φ. In the first case, deductive inference is used, while in the second case, inductive inference is used. In either case, the measures need not be specified.

Since σ and ϕ are derived solely from Σ and Φ, respectively, they are not affected by the measures either.

Because of the flexibility of the AKB, we can have conjectures (E₁→L) and (E₂→L), both in λ. The two essentially refer to the same conjecture L. The difference between them is the set associated with L, i.e., E₁ and E₂. This allows us to specify different constraints involving E₁ and E₂.

Let κ be an AKB and λ a collection of conjectures over κ. Let L∈

be the knowledge we want to assert. If σ_(κ∪λ)(L) does not contain any element in λ, then none of the conjectures will help in the establishment of L. In other words, the missing links are still out of reach, especially if the extended measure {tilde over (m)}(σ_(κ∪λ)(L)) is smaller than desired. Modifications or additions of the conjectures are therefore indicated.

Similar conclusions follow if we replaced σ in the above by σ, ϕ or ϕ.

If some of the conjectures appear in σ_(κ∪λ)(L), then it may be useful to closely examine Σ_(κ∪λ)(L) and/or Φ_(κ∪λ)(L). We shall concentrate on Σ_(κ∪λ)(L) for the time being.

Let κ be an AKB and ω∈{circumflex over (κ)}.

-   1. ω is a conjunction over κ if and only if ω can be expressed in the form μ₁∧μ₂∧ . . . ∧μ_(n) where n≥1 and for all i≤n, μ_(i)∈κ.
-   2. ω is a disjunction over κ if and only if ω can be expressed in the form μ₁∨μ₂∨ . . . ∨μ_(n) where n≥1 and for all i≤n, μ_(i)∈κ.

Let κ be an AKB and for all i≤n, μ_(i)∈κ.

-   1. Let ω=μ₁∧μ₂∧ . . . ∧μ_(n) be a conjunction over κ. ω is F-minimal over κ if and only if ω=F and if any of the μ_(i)'s is removed from ω, then ω≠F.
-   2. Let ω=μ₁∨μ₂∨ . . . ∨μ_(n) be a disjunction over κ. ω is T-minimal over κ if and only if ω=T and if any of the μ_(i)'s is removed from ω, then ω≠T.

Let κ be an AKB. γ_(κ) is the collection of all F-minimal conjunctions over κ and γ _(κ) is the collection of all T-minimal disjunctions over κ.
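The notion of F-minimality can be illustrated with a small Python sketch. This is a deliberately simplified model and not the specification's procedure: it assumes that each object is represented only by a set of propositional literals (the evidence part is ignored), and that a conjunction equals F exactly when it contains a literal together with its negation (written with a leading `~`). The names `is_false` and `f_minimal_conjunctions` and the sample objects are hypothetical.

```python
from itertools import combinations


def is_false(literal_sets):
    """Assumed test: the conjunction is F when it holds x and ~x together."""
    lits = set()
    for s in literal_sets:
        lits |= s
    return any(("~" + x) in lits for x in lits if not x.startswith("~"))


def f_minimal_conjunctions(objects):
    """Enumerate F-minimal conjunctions by brute force.

    objects maps an object name to its set of literals. Subsets are
    visited in order of increasing size, and any subset containing an
    already-found F-minimal subset is skipped, so every reported subset
    equals F while none of its proper subsets does.
    """
    names = sorted(objects)
    found = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if any(m <= set(combo) for m in found):
                continue  # a smaller F conjunction is contained in combo
            if is_false([objects[n] for n in combo]):
                found.append(frozenset(combo))
    return found


kb = {
    "mu1": {"a"},
    "mu2": {"~a", "b"},
    "mu3": {"~b"},
    "mu4": {"c"},
}
for conj in f_minimal_conjunctions(kb):
    print(sorted(conj))
```

With the sample data, the two F-minimal conjunctions are mu1∧mu2 and mu2∧mu3. The T-minimal disjunctions can be enumerated dually. The brute-force search is exponential in the number of objects and is intended only to make the definition concrete.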

In the rest of the Section, we shall let π=κ∪λ where κ is an AKB and λ is a collection of conjectures over κ.

Let μ∈γ_(λ). Then

-   1. p_(π)(μ)=∨_(ω∈γ_(π), ω_(λ)=μ) ω_(κ). p_(π)(μ) will be referred to as the potential of μ over π.
-   2. μ is consequential over κ if and only if l(p_(π)(μ))⊇σ_(κ)(F). Otherwise, μ is inconsequential over κ.
-   3. Let μ₀∈λ. μ₀ is consequential over κ if and only if μ₀ occurs in some μ₁∈γ_(λ) where μ₁ is consequential over κ. Otherwise, μ₀ is inconsequential over κ.

Let L∈

, κ_(L)=κ∪{(U→L′)}, π_(L)=κ_(L)∪λ, and μ∈γ_(λ).

-   1. μ is L-consequential over κ if and only if l(p_(π_(L))(μ))⊇σ_(κ)(L). Otherwise, μ is L-inconsequential over κ.
-   2. Let μ₀∈λ. μ₀ is L-consequential over κ if and only if μ₀ occurs in some μ₁∈γ_(λ) where μ₁ is L-consequential over κ. Otherwise, μ₀ is L-inconsequential over κ.

Let L∈

. If the conjecture μ is L-inconsequential over κ, then establishing μ will not improve the validity of L over κ. Therefore, for simplicity, μ will be eliminated from λ. Moreover, some of the μ∈γ_(λ) may be L-inconsequential; thus, we define γ _(π_(L)) to consist of all μ∈γ_(λ) which are L-consequential over κ. If γ _(π_(L))=∅, then none of the conjectures in λ will provide the missing links for L.

If m is a κ-measure and {tilde over (m)} an extension of m, then the elements in γ _(π_(L)) may be sorted in descending order of {tilde over (m)}(r(p_(π_(L))(μ))), where μ∈γ _(π_(L)). This ordering provides a priority list for examining the various conjectures given in λ that support L.
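The filtering and ranking steps described above can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not part of the specification: the potential of each conjecture is taken to be already computed and represented as a set of evidence identifiers, σ_(κ)(L) is likewise given as a set, and the extended measure is modeled as an additive weight over individual evidences. The names `extended_measure` and `prioritize` and all sample values are hypothetical.

```python
def extended_measure(evidence, weights):
    """Assumed additive measure: the sum of the per-evidence weights."""
    return sum(weights[e] for e in evidence)


def prioritize(potentials, support_of_L, weights):
    """Drop L-inconsequential conjectures and rank the remaining ones.

    A conjecture is kept only if the evidence set of its potential
    contains the support of L (the superset test in the definition of
    L-consequential). The kept conjectures are sorted in descending
    order of the assumed measure of their potentials.
    """
    consequential = {
        name: ev for name, ev in potentials.items() if ev >= support_of_L
    }
    return sorted(
        consequential,
        key=lambda name: extended_measure(consequential[name], weights),
        reverse=True,
    )


weights = {1: 1, 2: 2, 3: 3, 4: 6}   # hypothetical measure on evidences
support_of_L = {1}                   # stand-in for the support of L over the AKB
potentials = {                       # stand-ins for the computed potentials
    "c1": {1, 2, 3},
    "c2": {2},
    "c3": {1, 4},
}
print(prioritize(potentials, support_of_L, weights))  # prints ['c3', 'c1']
```

Here `c2` is eliminated because its potential does not contain the support of L, and `c3` is examined before `c1` because its measure (7) exceeds that of `c1` (6).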

The discussions so far were restricted to using the validity of L. Parallel results can be derived using the plausibility of L. In this case, we let κ_(L)=κ∪{(U→L)}, and π_(L)=κ_(L)∪λ.

In the above discussions, only deductive reasoning was used. However, the missing links can also be found using inductive inference by applying the results given above to π.

Some example benefits according to the described embodiments include:

The embodiments provide a new and innovative process, Augmented Exploration or A-Exploration, for working with Big Data by utilizing Big Data's own power and exploiting its various unique properties. A-Exploration has the capabilities to continuously uncover, track, understand, analyze, manage, and/or utilize any desired information and knowledge, as well as to oversee, regulate and/or supervise the development, progression and/or evolution of such information and knowledge.

The embodiments permit organizations and policy makers to employ A-Exploration to address and mitigate, in real time, considerable challenges in capturing the full benefit and potential of Big Data and Beyond. A-Exploration comprises many sequential and parallel phases and/or sub-systems required to control and handle its various operations.

The embodiments automate the overseeing of, and provide a possible roadmap for, the management, utilization and projection of the outcomes and results obtained via A-Exploration.

The embodiments provide real-time capabilities for handling health care, public sector administration, retail, manufacturing, personal location data, unfolding stories, knowledge/information sharing and discovery, personal assistance, as well as any field that deals with information and/or knowledge, which covers virtually all areas in this Big Data era.

The embodiments enable and empower A-Exploration through the extensions, expansions, integrations and supplementations to create and establish diverse methods and functionalities to simplify and solve virtually all problems associated with Big Data, Information and Knowledge, including but not limited to:

-   1. Continuously Uncover and Track the Desired Information and Knowledge.
-   2. Analysis and Deep Understanding of the Information/Knowledge.
-   3. Manage and Supervise the Evolution of the Information/Knowledge.
-   4. Enable and Generate Projection/Forecasting and Abduction.
-   5. Detect and Anticipate Emergent Events and Behaviors.
-   6. Locate and Unearth the Missing Links between the Current Situations and the Eventual Desired Solutions.

The embodiments provide a new unification process for use with Augmented Knowledge Bases (AKBs) which incorporates Anytime-Anywhere methodologies and allows heuristics to be employed to speed up the process and improve the results. This new process can also be extended to include tags to identify the sources of the results.

The embodiments provide a method for purifying an object ω in the knowledge base by removing all target inconsistencies (as determined by application criteria) contained in r(ω), comprising transforming the object into disjunctive normal form and then removing all conjunctions in r(ω) which are equivalent to FALSE. The result will be referred to as a purified object.
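The purification step can be sketched in Python as follows. This is a minimal illustration, assuming that r(ω) has already been transformed into disjunctive normal form and is represented as a list of conjunctions, each a set of literals, with negation written as a leading `~`. The only inconsistency detected here is a literal occurring together with its negation; the actual target inconsistencies are determined by application criteria and may differ. The names `is_contradictory` and `purify` are hypothetical.

```python
def is_contradictory(conjunction):
    """Assumed criterion: a conjunction is FALSE if it has both x and ~x."""
    return any(
        ("~" + lit) in conjunction
        for lit in conjunction
        if not lit.startswith("~")
    )


def purify(dnf):
    """Return the DNF with every conjunction equivalent to FALSE removed."""
    return [conj for conj in dnf if not is_contradictory(conj)]


# (a ^ b) v (a ^ ~a) v (~b ^ c): the middle conjunction is FALSE.
r_omega = [
    frozenset({"a", "b"}),
    frozenset({"a", "~a"}),
    frozenset({"~b", "c"}),
]
purified = purify(r_omega)
print(len(purified))  # prints 2
```

Removing a FALSE conjunction does not change the truth value of the disjunction, so the purified form is logically equivalent to the original while no longer carrying the detected inconsistency.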

The embodiments provide for constructing purified validity and purified plausibility by using purified objects from the knowledge base.

The embodiments provide a new unification method for any AKBs, consistent or otherwise, where inconsistencies are approximately completely ignored in the unification process, and inconsistencies are handled directly without having to determine them in advance or remove them first.

The embodiments provide the purified validity and purified plausibility of L, which may be determined by repeated applications of the decomposition rules, including:

-   1. the purified validity of the disjunction L₁∨L₂ is the union of the individual purified validities;
-   2. the purified validity of the conjunction L₁∧L₂ is the intersection of the individual purified validities;
-   3. the purified plausibility of the disjunction L₁∨L₂ is the union of the individual purified plausibilities; and
-   4. the purified plausibility of the conjunction L₁∧L₂ is the intersection of the individual purified plausibilities.

The embodiments provide analyses and deep understanding of many aspects of the information/knowledge, comprising, among other things, finding the best explanations for the various scenarios and possible courses/paths of how the information/knowledge evolved and/or progressed, using deductive and/or inductive reasoning.

According to the embodiments, A-Exploration includes the following functionalities:

-   1. Identify the Organization and Chronology of the Information and Knowledge.
-   2. Comprehend and Understand the Various Occurrences and Happenings.
-   3. Resolve and Establish the Reasons and Explanations for the Origins and Causes of the Various Incidences and Manifestations.
-   4. Determine the Causal Effects and Relationships of the Information and Knowledge.
-   5. Cognizance and Knowledge of Areas Related or Linked to the Information and Knowledge.
-   6. Appreciation and Consciousness of the Likely Influences from External Subjects or Areas.
-   7. Awareness and Recognition of the Ramifications.
-   8. Verification and Anticipation of the Possible Spread and Impact to Other Areas.
-   9. Realization and Management of the Plausible Consequences.

The embodiments may be utilized to manage and supervise the evolution of the many aspects of the information/knowledge, comprising deductive and/or inductive prediction, projection and/or forecasting. In particular, the embodiments may be utilized to: project, predict and forecast the effects and consequences of the various actions and activities; instigate measures to supervise and regulate the information and knowledge; and initiate credible actions to mitigate and/or redirect the plausible effects and outcomes of the information and knowledge.

The embodiments provide the capabilities of exploring for the desired/ideal alternative outcomes. This may consist, among other things, of augmenting the CBKBs or AKBs with temporary, non-permanent and/or transitory knowledge fragments, e.g., Hypotheses Plug-Ins.

The embodiments can be implemented by having multiple A-Exploration systems, each having its own objectives. These A-Exploration systems can be used to build larger systems in hierarchical and/or other fashions, depending on their needs; the systems can be allowed to cross-pollinate to enhance their overall effects; and the various subsystems of different A-Exploration systems may be combined to optimize their functions.

The embodiments provide for creating the building blocks needed to perform projection/forecasting of possible outcomes for any given hypothesis/situation, comprising setting up the deductive consequent and inductive consequent.

The embodiments provide for projecting/forecasting the possible outcomes of a given hypothesis or situation, using deductive validity measures, in terms of Augmented Knowledge Bases or other similarly structured constructs, comprising: methods for determining projection/forecasting with respect to any atomic proposition; and methods for combining and merging the projection/forecasting of atomic propositions to construct the possible outcomes of projection/forecasting for a given proposition.

According to the embodiments, the projection/forecasting uses deductive plausibility measures, inductive validity measures and/or inductive plausibility measures.

According to the embodiments, the possible projections/forecasting are ranked.

The embodiments provide for creating the building blocks needed to perform abduction or determine best explanations of a given observation or situation, comprising setting up the deductive antecedent and inductive antecedent.

The embodiments provide for using abduction to determine the best explanations of a given observation or situation, using deductive validity measures, in terms of Augmented Knowledge Bases or other similarly structured constructs, comprising: ways for determining abduction with respect to any atomic proposition; and methods for combining and merging the abduction of atomic propositions to construct the possible explanation of given observations/propositions.

According to the embodiments, the abduction uses deductive plausibility measures, inductive validity measures and/or inductive plausibility measures.

According to the embodiments, the possible abductions are ranked.

According to the embodiments, the inverses of κ and are used to perform inductive reasoning.

The embodiments provide for speeding up the projections/forecasting and/or abductions by storing the results for each atomic proposition for faster retrieval.

The embodiments provide for extending and expanding the projections/forecasting and/or abductions to higher-order projections/forecasting and abductions.

The embodiments determine and handle emergent events/behaviors of complex systems, including collections of AKBs.

The embodiments promote scientific and other discoveries, comprising locating and unearthing the missing links between the current situations and the eventual desired solutions.

The embodiments simplify and accelerate the search for the missing links in scientific and other discoveries, comprising: introducing and formulating conjectures, which may consist of information and/or knowledge derived from experiences, educated guesses, findings from similar or related problems, preliminary experimental outcomes, etc.; expressing these conjectures in terms identifiable by the selected knowledge base to permit them to be part of the reasoning process in the knowledge base; establishing the conjectured knowledge base consisting of the given knowledge base and the conjectures; employing the reasoning mechanism of the resulting knowledge base to determine which conjectures are consequential and ranking the consequential conjectures to prioritize the conjecture(s) to be examined first; and eliminating the inconsequential conjectures to improve the search for the missing links.

The embodiments provide an apparatus comprising a computer readable storage medium configured to support managing the required data objects and a hardware processor to execute the necessary methods and procedures.

According to an aspect of the embodiments of the invention, any combinations of one or more of the described features, functions, operations, and/or benefits can be provided. The word (prefix or suffix article) “a” refers to one or more unless specifically indicated or determined to refer to a single item. The word (prefix or suffix article) “each” refers to one or more unless specifically indicated or determined to refer to all items. A combination can be any one of or a plurality. The expression “at least one of” a list of item(s) refers to one or any combination of the listed item(s). The expression “all” refers to an approximate, about, substantial amount or quantity up to and including “all” amounts or quantities.

The embodiments can be implemented by a computing apparatus, such as (in a non-limiting example) any computer or computer processor, that includes processing hardware and/or software implemented on the processing hardware to transmit and receive (communicate (network) with other computing apparatuses), store and retrieve from computer readable storage media, process and/or output data. According to an aspect of an embodiment, the described features, functions, operations, and/or benefits can be implemented by and/or use processing hardware and/or software executed by processing hardware. For example, a computing apparatus as illustrated in FIG. 8 can comprise a central processing unit (CPU) or computing processing system 804 (e.g., one or more processing devices (e.g., chipset(s), including memory, etc.)) that processes or executes instructions, namely software/program, stored in the memory 806 and/or computer readable storage media 812, a communication media interface (network interface) 810 (e.g., a wire/wireless data network interface to transmit and receive data), an input device 814, and/or an output device 802, for example, a display device or a printing device, which are coupled (directly or indirectly) to each other, for example, can be in communication among each other through one or more data communication buses 808.

In addition, an apparatus can include one or more apparatuses in computer network communication with each other or other apparatuses and the embodiments relate to augmented exploration for big data involving one or more apparatuses, for example, data or information involving local area network (LAN) and/or Intranet based computing, cloud computing in case of Internet based computing, Internet of Things (IoT) (network of physical objects—computer readable storage media (e.g., databases, knowledge bases), devices (e.g., appliances, cameras, mobile phones), vehicles, buildings, and other items, embedded with electronics, software, sensors that generate, collect, search (query), process, and/or analyze data, with network connectivity to exchange the data), online websites. In addition, a computer processor can refer to one or more computer processors in one or more apparatuses or any combinations of one or more computer processors and/or apparatuses. An aspect of an embodiment relates to causing and/or configuring one or more apparatuses and/or computer processors to execute the described operations. The results produced can be output to an output device, for example, displayed on the display or by way of audio/sound. An apparatus or device refers to a physical machine that performs operations by way of electronics, mechanical processes, for example, electromechanical devices, sensors, a computer (physical computing hardware or machinery) that implement or execute instructions, for example, execute instructions by way of software, which is code executed by computing hardware including a programmable chip (chipset, computer processor, electronic component), and/or implement instructions by way of computing hardware (e.g., in circuitry, electronic components in integrated circuits, etc.) collectively referred to as hardware processor(s), to achieve the functions or operations being described. 
The functions of embodiments described can be implemented in a type of apparatus that can execute instructions or code.

More particularly, programming or configuring or causing an apparatus or device, for example, a computer, to execute the described functions of embodiments of the invention creates a new machine where in case of a computer a general purpose computer in effect becomes a special purpose computer once it is programmed or configured or caused to perform particular functions of the embodiments of the invention pursuant to instructions from program software. According to an aspect of an embodiment, configuring an apparatus, device, computer processor, refers to such apparatus, device or computer processor programmed or controlled by software to execute the described functions.

A program/software implementing the embodiments may be recorded on computer-readable storage media, e.g., non-transitory or persistent computer-readable storage media. Examples of the non-transitory computer-readable media include a magnetic recording apparatus, an optical disk, a magneto-optical disk, and/or volatile and/or non-volatile semiconductor memory (for example, RAM, ROM, etc.). Examples of the magnetic recording apparatus include a hard disk device (HDD), a flexible disk (FD), and a magnetic tape (MT). Examples of the optical disk include a DVD (Digital Versatile Disc), DVD-ROM, DVD-RAM (DVD-Random Access Memory), BD (Blu-ray Disc), a CD-ROM (Compact Disc-Read Only Memory), and a CD-R (Recordable)/RW. The program/software implementing the embodiments may be transmitted over a transmission communication path, e.g., a wire and/or a wireless network implemented via hardware. An example of communication media via which the program/software may be sent includes, for example, a carrier-wave signal.

The many features and advantages of the embodiments are apparent from the detailed specification and, thus, it is intended by the appended claims to cover all such features and advantages of the embodiments that fall within the true spirit and scope thereof. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the inventive embodiments to the exact construction and operation illustrated and described, and accordingly all suitable modifications and equivalents may be resorted to, falling within the scope thereof. 

What is claimed is:
 1. A method for an apparatus including a memory and a processor coupled to the memory to augment at least one digitized data of a plurality of digitized data input from a plurality of computerized data sources d₁, d₂, . . . , d_(l) forming a first set of evidences U to represent a first knowledge base (KB) among a plurality of KBs, the method comprising: selecting a subset of concept graphs of nodes cα_(i) ₁ , cα_(i) ₂ , . . . , cα_(i) _(h) from concept graphs cα₁, cα₂, . . . , cα_(n) according to a computable measure of consistency, inconsistency, and/or priority threshold between each cα_(j) in cα₁, cα₂, . . . , cα_(n) and each specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m) of concept nodes of concepts and relation nodes generated according to the at least one digitized data representing the first KB, the concept graphs of nodes cα₁, cα₂, . . . , cα_(n) including concept nodes and relation nodes for corresponding obtained plurality of information and knowledge (IKs) α₁, α₂, . . . , α_(n) forming a second set of evidences U to represent a second KB among the plurality of KBs; generating knowledge fragment objects in form of concept fragments obtained for corresponding subset of concept graphs cα_(i) ₁ , cα_(i) ₂ , . . . 
, cα_(i) _(h) , a knowledge fragment object among the knowledge fragment objects to store a mapping of values to first and second sets of evidences U, where A is a rule among rules A′ in at least the first and second KBs among the plurality of KBs, and E is a subset of the first and second sets of evidences U from the at least first and second KBs that supports the rule A, so that the rule A is supportable by the subset of evidences E, according to the concept fragments; and generating a new KB, adding into at least one KB among the plurality of KBs, and/or adding into the first and/or second KBs for the concept fragments, to include augmenting information objects of augmenting information by, creating objects in form ω=E→A from the concept fragments; computing for each object ω a validity (v) and a plausibility (p) based upon atomic propositions among the rules A′; obtaining relationship constraints

_(κ) in form of a plurality of set relations among a plurality of subsets of evidences E for the concept fragments; obtaining propositions

_(κ) for the plurality of concept fragments in form of logical relations from among the rules A′ in the at least first and second KBs and/or from the atomic propositions; computing a validity (v) and a plausibility (p) for a combination of the relationship constraints

_(κ) and the propositions

_(κ); and generating information tags to identify each object ω, each relationship constraint in

_(κ), and each proposition in

_(κ), to cause extending, by the augmenting information objects, at least a forecasting and/or an abduction based upon the concepts to a higher-order projection and/or abduction deductively and/or inductively in conjunction with the generated information tags, wherein the spec₁, spec₂, . . . , spec_(m) is information generated in response to a query of a data source d among the data sources d₁, d₂, . . . , d_(l) or a domain specification.
 2. The method according to claim 1, further comprising: generating and ranking new objects in the form ω=E→A based on validity and plausibility using a computerized process of forecasting according to the spec₁, spec₂, . . . , spec_(m).
 3. The method according to claim 1, further comprising: generating and ranking a plurality of the objects in the form ω=E→A based on validity and plausibility that support at least one object ω=E→A from a plurality of objects in the KBs corresponding to spec₁, spec₂, . . . , spec_(m) using a computerized process of abduction.
 4. The method according to claim 1, further comprising: executing based upon the spec₁, spec₂, . . . , spec_(m) any one or combination of processes of: computing, using deductive and/or inductive reasoning, a plurality of sequences of a plurality of the objects in the form ω=E→A to be indicative of evolving and/or progress of the plurality of the objects in the form ω=E→A corresponding to a plurality of IKs α₁, α₂, . . . , α_(n); computing a plurality of alternative outcomes on a plurality of the objects in the form ω=E→A using deductive projection forecasting and/or inductive projection forecasting; computing emergence on a plurality of the objects in the form ω=E→A; computing a plurality of new objects in the form ω=E→A that serve as unknown/missing links in the computerized reasoning processes for determining validity and plausibility; and generating results of the one or more corresponding processes executed.
 5. The method according to claim 1, wherein the generating the specification concept graphs of nodes spec₁, spec₂, . . . , spec_(m) includes computing, for the first KB, a plurality of new objects of form ω=E→A from the plurality of digitized data to serve as an alignment between the plurality of digitized data.
 6. The method according to claim 1, wherein the KBs include Bayesian Knowledge Bases (BKBs), Compound BKBs (CBKBs), Relational Databases (RDbs), Deductive Databases (DDbs), Augmented Knowledge-Bases (AKBs).
 7. The method according to claim 1, wherein the concepts are according to user input to control generation of the specification concept graphs.
 8. The method according to claim 1, wherein the information tags include one or a combination of information to indicate an audit of generation of the knowledge fragment objects, or provide an explanation of the computing.
 9. An apparatus, comprising: a memory; and a processor coupled to the memory and to, select a subset of concept graphs of nodes cα_(i) ₁ , cα_(i) ₂ , . . . , cα_(i) _(h) from concept graphs cα₁, cα₂, . . . , cα_(n) according to a computable measure of consistency, inconsistency, and/or priority threshold between each cα_(j) in cα₁, cα₂, . . . , cα_(n) and each specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m) of concept nodes of concepts and relation nodes generated according to at least one digitized data of a plurality of digitized data input from a plurality of computerized data sources d₁, d₂, . . . , d_(l) forming a first set of evidences U to represent a first knowledge base (KB) among a plurality of KBs, the concept graphs of nodes cα₁, cα₂, . . . , cα_(n) including concept nodes and relation nodes for corresponding obtained plurality of information and knowledge (IKs) α₁, α₂, . . . , α_(n) forming a second set of evidences U to represent a second KB among the plurality of KBs; generate knowledge fragment objects of concept fragments obtained for corresponding subset of concept graphs cα_(i) ₁ , cα_(i) ₂ , . . . 
, cα_(i) _(h) , a knowledge fragment object among the knowledge fragment objects to store a mapping of values to first and second sets of evidences U, where A is a rule among rules A′ in at least the first and second KBs among the plurality of KBs, and E is a subset of the first and second sets of evidences U from the at least first and second KBs that supports the rule A, so that the rule A is supportable by the subset of evidences E, according to the concept fragments; generate a new KB, add into at least one KB among the plurality of KBs, and/or add into the first and/or second KBs for the concept fragments, to include augmenting information objects of augmenting information by, creating objects in form ω=E→A from the concept fragments; computing for each object ω a validity (v) and a plausibility (p) based upon atomic propositions among the rules A′; obtaining relationship constraints

_(κ) in form of a plurality of set relations among a plurality of subsets of evidences E for the concept fragments; obtaining propositions

_(κ) for the plurality of concept fragments in form of logical relations from among the rules A′ in the at least first and second KBs and/or from the atomic propositions; computing a validity (v) and a plausibility (p) for a combination of the relationship constraints

_(κ) and the propositions

_(κ); and generating information tags to identify each object ω, each relationship constraint in

_(κ), and each proposition in

_(κ), to cause extending, by the augmenting information objects, at least a forecasting and/or an abduction based upon the concepts to a higher-order projection and/or abduction deductively and/or inductively in conjunction with the generated information tags, wherein the spec₁, spec₂, . . . , spec_(m) is information generated in response to a query of a data source d among the data sources d₁, d₂, . . . , d_(l) or a domain specification.
 10. The apparatus according to claim 9, wherein the processor is to: generate and rank new objects in the form ω=E→A based on validity and plausibility using a computerized process of forecasting according to the spec₁, spec₂, . . . , spec_(m).
 11. The apparatus according to claim 9, wherein the processor is to: generate and rank a plurality of the objects in the form ω=E→A based on validity and plausibility that support at least one object ω=E→A from a plurality of objects in the KBs corresponding to spec₁, spec₂, . . . , spec_(m) using a computerized process of abduction.
 12. The apparatus according to claim 9, wherein the processor is to, execute based upon the spec₁, spec₂, . . . , spec_(m) any one or combination of processes of, computing, using deductive and/or inductive reasoning, a plurality of sequences of a plurality of the objects in the form ω=E→A to be indicative of evolving and/or progress of the plurality of the objects in the form ω=E→A corresponding to a plurality of IKs α₁, α₂, . . . , α_(n); computing a plurality of alternative outcomes on a plurality of the objects in the form ω=E→A using deductive projection forecasting and/or inductive projection forecasting; computing emergence on a plurality of the objects in the form ω=E→A; computing a plurality of new objects in the form ω=E→A that serve as unknown/missing links in the computerized reasoning processes for determining validity and plausibility; and generate results of the one or more corresponding processes executed.
 13. The apparatus according to claim 9, wherein the generating the specification concept graphs of nodes spec₁, spec₂, . . . , spec_(m) includes computing, for the first KB, a plurality of new objects of form ω=E→A from the plurality of digitized data to serve as an alignment between the plurality of digitized data.
 14. The apparatus according to claim 9, wherein the KBs include Bayesian Knowledge Bases (BKBs), Compound BKBs (CBKBs), Relational Databases (RDbs), Deductive Databases (DDbs), Augmented Knowledge-Bases (AKBs).
 15. The apparatus according to claim 9, wherein the at least one processor is to receive the concepts according to user input to control generation of the specification concept graphs.
 16. The apparatus according to claim 9, wherein the information tags include one or a combination of information to indicate an audit of generation of the knowledge fragment objects, or provide an explanation of the computing.
 17. An apparatus, comprising: a memory; and a processor coupled to the memory and to: select a subset of concept graphs of nodes cα_(i₁), cα_(i₂), . . . , cα_(i_h) from cα₁, cα₂, . . . , cα_(n) according to a computable measure of consistency, inconsistency and/or priority threshold between cα_(j) in cα₁, cα₂, . . . , cα_(n) and specification concept graph spec_(k) in spec₁, spec₂, . . . , spec_(m) of concept nodes of concepts and relation nodes generated according to at least one digitized data of a plurality of digitized data input from a plurality of computerized data sources d₁, d₂, . . . , d_(l) forming a first set of evidences U to represent a first knowledge base (KB) among a plurality of KBs, the concept graphs of nodes cα₁, cα₂, . . . , cα_(n) including concept nodes and relation nodes for corresponding obtained plurality of information and knowledge (IKs) α₁, α₂, . . . , α_(n) forming a second set of evidences U to represent a second KB among the plurality of KBs; and generate knowledge fragment objects of concept fragments obtained for corresponding subset of concept graphs cα_(i₁), cα_(i₂), . . . , cα_(i_h), to include augmenting information objects by creating or adding into at least one KB among the KBs, a new object in form ω=E→A from the concept fragments, including a computed validity (v) and a plausibility (p) for a combination of relationship constraints _(κ) for the concept fragments and obtained propositions _(κ) for the concept fragments, wherein in the new object ω=E→A, A is a rule among rules A′ in at least the first and second KBs among the plurality of KBs, and E is a subset of the first and second sets of evidences U from the at least first and second KBs that supports the rule A, so that the rule A is supportable by the subset of evidences E, according to the concept fragments, and wherein the spec₁, spec₂, . . . , spec_(m) is information generated in response to a query of a data source d among the data sources d₁, d₂, . . . , d_(l) or a domain specification.
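
By way of non-limiting illustration only, the selection of concept graphs against a threshold and the ranking of objects in the form ω=E→A by validity and plausibility might be sketched as follows. The sketch is not the claimed implementation. It rests on several assumptions that do not come from the claims: a concept graph is reduced to a set of node labels, the computable measure of consistency is taken to be Jaccard overlap, validity (v) and plausibility (p) are supplied as precomputed numbers, and ranking is by v first and p second. The names `KnowledgeObject`, `consistency`, `select_graphs` and `rank_objects` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class KnowledgeObject:
    """Hypothetical stand-in for an object of the form omega = E -> A."""
    evidence: FrozenSet[str]  # E: a subset of the set of evidences U
    rule: str                 # A: the rule supported by E
    validity: float           # v, assumed to be computed elsewhere
    plausibility: float       # p, assumed to be computed elsewhere


def consistency(graph: FrozenSet[str], spec: FrozenSet[str]) -> float:
    """Assumed measure: Jaccard overlap between two sets of node labels."""
    union = graph | spec
    return len(graph & spec) / len(union) if union else 0.0


def select_graphs(graphs: List[FrozenSet[str]],
                  specs: List[FrozenSet[str]],
                  threshold: float) -> List[FrozenSet[str]]:
    """Keep each concept graph that meets the threshold for at least one spec."""
    return [g for g in graphs
            if any(consistency(g, s) >= threshold for s in specs)]


def rank_objects(objects: List[KnowledgeObject]) -> List[KnowledgeObject]:
    """Order objects by validity, then plausibility, highest first (assumed)."""
    return sorted(objects,
                  key=lambda o: (o.validity, o.plausibility),
                  reverse=True)
```

In this sketch a concept graph sharing two of three node labels with a specification graph has a consistency of 2/3 and is retained at a threshold of 0.5, while a graph sharing no labels is dropped. Any other computable measure, including one of inconsistency or priority, could be substituted for `consistency` without changing the surrounding structure.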